WORKING  PAPER 
ALFRED  P.  SLOAN  SCHOOL  OF  MANAGEMENT 


ECONOMETRIC  EVALUATION  OF 
ASSET  PRICING  MODELS 

by 

Lars  Peter  Hansen 
John  Heaton 
Erzo  Luttmer 


WP#  3606-93 


August  1993 


MASSACHUSETTS 

INSTITUTE  OF  TECHNOLOGY 

50  MEMORIAL  DRIVE 

CAMBRIDGE,  MASSACHUSETTS  02139 




Econometric  Evaluation  of  Asset  Pricing  Models 


August  1993 


Lars  Peter  Hansen 
University  of  Chicago,  NBER  and  NORC 


John  Heaton 
M.I.T. and NBER


Erzo  Luttmer 
Northwestern  University 


We  thank  Craig  Burnside,  Bo  Honore,  Andrew  Lo,  Marc  Roston,  Whitney  Newey, 
Jose  Scheinkman,  Jean-Luc  Vila,  Jiang  Wang,  Amir  Yaron  and  especially  Ravi 
Jagannathan  for  helpful  comments  and  discussions.  We  also  received  valuable 
remarks  from  seminar  participants  at  the  1992  meetings  of  the  Society  of 
Economic  Dynamics  and  Control,  the  1992  NBER  Summer  Institute  and  at 
Cornell,  Duke,  L.S.E.  and  Waterloo  Universities.  Finally,  we  gratefully 
acknowledge  the  financial  assistance  of  the  International  Financial  Services 
Research  Center  at  M.I.T.  (Heaton)  and  the  National  Science  Foundation 
(Hansen)  and  the  Sloan  Foundation  (Luttmer). 


Abstract 

In  this  paper  we  provide  econometric  tools  for  the  evaluation  of 
intertemporal  asset  pricing  models  using  specification-error  and  volatility 
bounds.  We  formulate  analog  estimators  of  these  bounds,  give  conditions  for 
consistency  and  derive  the  limiting  distribution  of  these  estimators.  The 
analysis  incorporates  market  frictions  such  as  short-sale  constraints  and 
proportional  transactions  costs.  Among  several  applications  we  show  how  to 
use  the  methods  to  assess  specific  asset  pricing  models  and  to  provide 
nonparametric  characterizations  of  asset  pricing  anomalies. 


I.   Introduction 

Frictionless  market  models  of  asset  pricing  imply  that  asset  prices  can 
be  represented  by  a  stochastic  discount  factor  or  pricing  kernel.  For 
example,  in  the  Capital  Asset  Pricing  Model  (CAPM)  the  discount  factor  is 
given  by  a  constant  plus  a  scale  multiple  of  the  return  on  the  market 
portfolio.  In  the  Consumption-Based  CAPM  (CCAPM)  the  discount  factor  is 
given  by  the  intertemporal  marginal  rate  of  substitution  of  an  investor.  If 
r  is  the  net  return  on  an  asset  and  m  is  the  marginal  rate  of  substitution, 
then  the  CCAPM  implies  that: 

(1.1)     $1 = E[m(1+r) \mid \mathcal{F}]$

where $\mathcal{F}$ is the information set of the investor today. More generally, if $m$
is the stochastic discount factor, today's price, $\pi(p)$, of an asset payoff,
$p$, tomorrow is given by:

(1.2)     $\pi(p) = E(mp \mid \mathcal{F})$.

Thus  a  stochastic  discount  factor  m  "discounts"  payoffs  in  each  state  of  the 
world  and,  as  a  consequence,  adjusts  the  price  according  to  the  riskiness  of 
the  payoff.  From  the  vantage  point  of  an  empirical  analysis,  we  envision  the 
stochastic  discount  factor  as  the  vehicle  linking  a  theoretical  model  to 
observable  implications. 
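To make (1.2) concrete, the following minimal sketch prices two payoffs in a two-state economy. The probabilities, discount factor values and payoffs are hypothetical numbers chosen for illustration; they do not come from any data set discussed in the paper.

```python
# Illustrative two-state economy; every number below is hypothetical.
def price(probs, m, payoff):
    """Price a payoff as E[m * p] over a finite set of states, as in (1.2)."""
    return sum(pr * mi * pi for pr, mi, pi in zip(probs, m, payoff))

probs = [0.5, 0.5]       # state probabilities
m = [1.2, 0.7]           # discount factor: the first ("bad") state is weighted more heavily
riskless = [1.0, 1.0]    # pays one unit in both states
risky = [0.5, 1.5]       # same expected payoff, concentrated in the second ("good") state

price_riskless = price(probs, m, riskless)
price_risky = price(probs, m, risky)
print(price_riskless, price_risky)
```

Both payoffs have an expected value of one, yet the risky payoff receives the lower price because it pays off mostly in the state where m is low.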

Given  a  particular  model  for  the  stochastic  discount  factor,  the 
implications  of  (1.2)  can  be  assessed  by  first  taking  unconditional 
expectations,  yielding 


(1.3)     $E\pi(p) = E(mp)$.

When  m  is  observable  (at  least  up  to  a  finite-dimensional  parameter  vector) 
by  the  econometrician,  a  test  of  (1.2)  can  be  performed  using  a  time  series 
of  a  vector  of  portfolio  payoffs  and  prices  by  examining  whether  the  sample 
analogs of the left and right sides of (1.3) are significantly different from
each  other.  Examples  of  this  type  of  procedure  can  be  found  in  Hansen  and 
Singleton  (1982),  Brown  and  Gibbons  (1985),  MacKinlay  and  Richardson  (1991) 
and  Epstein  and  Zin  (1991). 
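A stripped-down version of such a test can be sketched as follows, for a single asset and an observable discount factor. The function name `pricing_error_tstat`, the sample values, and the assumption that pricing errors are serially uncorrelated are all illustrative simplifications; the studies cited above use richer moment conditions and standard errors.

```python
import math

def pricing_error_tstat(m, x, q):
    """Return (mean pricing error, t-ratio) for the sample analog of E(mx) - Eq = 0."""
    u = [mt * xt - qt for mt, xt, qt in zip(m, x, q)]   # per-period pricing errors
    T = len(u)
    mean = sum(u) / T
    var = sum((ut - mean) ** 2 for ut in u) / (T - 1)   # sample variance of the errors
    se = math.sqrt(var / T)                              # assumes no serial correlation
    return mean, mean / se

# Hypothetical four-period sample: discount factor, gross return, and price (one).
m = [0.98, 0.95, 1.02, 0.97]
x = [1.05, 1.10, 0.96, 1.04]
q = [1.0, 1.0, 1.0, 1.0]
mean_err, t_ratio = pricing_error_tstat(m, x, q)
print(mean_err, t_ratio)
```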

While  tests  such  as  these  can  be  informative,  it  is  often  difficult  to 
interpret  the  resulting  statistical  rejections.  Further,  these  tests  are  not 
directly  applicable  when  there  are  market  frictions  such  as  transactions 
costs  or  short-sale  constraints.  For  example,  when  an  asset  cannot  be  sold 
short,  (1.2)  is  replaced  with  the  pricing  inequality: 

(1.4)     $\pi(p) \geq E(mp \mid \mathcal{F})$.

Finally, these tests cannot be used when the candidate discount factor
depends  on  variables  unavailable  to  the  econometrician. 

As  an  alternative  to  testing  directly  pricing  errors  using  (1.3),  we 
consider  a  different  set  of  tests  and  diagnostics  using  the 
specification-error  bounds  of  Hansen  and  Jagannathan  (1993),  and  the 
volatility  bounds  of  Hansen  and  Jagannathan  (1991).  We  also  consider 
extensions  of  these  tests  and  diagnostics,  developed  by  He  and  Modest  (1992) 
and  Luttmer  (1993),  that  handle  transactions  costs,  short-sale  restrictions 
and other market frictions. We develop an econometric methodology to provide
consistent estimators of the specification-error and volatility bounds.
Further,  we  develop  asymptotic  distribution  theory  that  is  easy  to  implement 
and  that  can  be  used  to  make  statistical  inferences  about  asset  pricing 
models  and  asset  market  data  using  the  bounds.  The  specification-error  and 
volatility  bounds,  along  with  the  econometric  methodology  that  we  develop, 
can  be  applied  to  address  several  related  issues. 

The  specification-error  bounds  of  Hansen  and  Jagannathan  (1993)  can  be 
used  to  examine  a  discount  factor  proxy  that  does  not  necessarily  correctly 
price  the  assets  under  consideration  (see  also  Bansal,  Hsieh  and  Viswanathan 
1992  for  an  application).  This  is  important  since  formal  statistical  tests 
of  many  particular  models  of  asset  pricing  imply  that  the  hypothesis  that 
their  pricing  errors  are  zero  is  a  very  low  probability  event.  Since  these 
models  are  typically  very  simple,  it  is  perhaps  not  surprising  that  they  do 
not  completely  capture  the  complexity  of  pricing  in  financial  markets.  The 
specification-error  bounds  give  measures  of  the  maximum  pricing  error  made  by 
the  discount  factor  proxy.  This  provides  a  way  to  assess  the  usefulness  of  a 
model even when it is technically misspecified. Further, this tool can
easily  accommodate  market  frictions  such  as  transactions  costs  and  short-sale 
constraints. 

Given  a  vector  of  asset  payoffs  and  prices,  (1.3)  typically  does  not 
uniquely determine m. Instead there is a whole family of m's that will
work.  Any  parametric  model  for  m  imposes  additional  restrictions  on  that 
family,  often  sufficient  to  identify  a  unique  stochastic  discount  factor. 
Rather  than  imposing  these  extra  restrictions,  Hansen  and  Jagannathan  (1991) 
showed  how  asset  market  data  on  payoffs  and  prices  can  be  used  to  construct 
feasible  sets  for  means  and  standard  deviations  of  stochastic  discount 
factors. The boundary points of these regions provide lower bounds on the
volatility (standard deviation) indexed by the mean. He and Modest (1992)
and Luttmer (1993) showed how to extend this analysis to the case where some
of the assets are subject to transactions costs or short-sales constraints.

These feasible sets of means and standard deviations of the stochastic
discount  factor  can  be  used  to  isolate  those  aspects  of  the  asset  market  data 
that  are  most  informative  about  the  stochastic  discount  factor.  One  way  to 
do  this  is  to  ask  whether  the  volatility  bound  becomes  significantly  sharper 
as  more  asset  market  data  is  added  to  the  analysis.  This  would  help  one 
assess  the  incremental  importance  of  additional  security  market  data  in  an 
econometric  analysis  without  having  to  limit  a  priori  the  family  of 
stochastic  discount  factors.  More  generally,  it  is  valuable  to  have  a 
characterization  of  the  sense  in  which  an  asset  market  data  set  is  puzzling 
without  having  to  take  a  precise  stand  on  the  underlying  valuation  model. 

When  testing  a  particular  model  of  asset  pricing  in  which  the  candidate 
m  is  specified,  it  is  often  useful  to  examine  whether  the  candidate  is  in  the 
feasible  region.  Moreover,  when  diagnosing  the  failures  of  a  specific  model, 
it  is  valuable  to  determine  whether  the  candidate  discount  factor  is  not 
sufficiently volatile or whether it is other aspects of the joint
distribution  of  asset  payoffs  and  the  candidate  discount  factor  that  are 
problematic. 

As  we  remarked  previously,  sometimes  it  is  not  possible  to  construct 
direct  observations  of  m,  making  pricing-error  tests  infeasible.  However, 
it  may  still  be  possible  to  calculate  the  moments  of  a  stochastic  discount 
factor  implied  by  a  model  which  can  then  be  compared  to  the  volatility 
bounds. For example, in Heaton (1993) a consumption-based CAPM model is
examined in which the consumption data is time averaged and preferences are
such that a simple linearization of the utility function can not be done to


consistency  result  for  estimators  of  the  arbitrage  bounds.  Those 
uninterested  in  the  consistency  results,  but  who  are  interested  in  the 
calculations  necessary  for  conducting  statistical  inference,  need  only  read 
Sections III.A, III.C and III.D before moving to Section IV.

In  Section  IV  we  present  several  applications  and  extensions  of  our 
results  each  of  which  can  be  read  independently  after  reading  Sections  II, 
III.A, III.C and III.D. In Section IV.A we discuss the sense in which the
entire feasible set of means and standard deviations for the stochastic
discount factor can be estimated. Section IV.B provides a discussion of
tests of whether the volatility bound becomes sharper with additional asset
market data. Section IV.C shows how to use the volatility bounds to test
models of the discount factor. Finally, in Section IV.D we extend the
specification-error  bound  to  a  case  where  there  are  parameters  of  the 
discount  factor  proxy  that  are  unknown  and  must  be  estimated.  Section  V 
contains  some  concluding  remarks. 

II.   General  Model  and  Bounds 

Our  starting  point  is  a  model  in  which  asset  prices  are  represented  by 
a  stochastic  discount  factor  or  pricing  kernel.  To  accommodate  security 
market  pricing  subject  to  transactions  costs,  we  permit  there  to  be 
short-sale  constraints  for  a  subset  of  the  securities.  Although  a 
short-sale  constraint  is  an  extreme  version  of  a  transactions  cost,  other 
proportional  transactions  costs  such  as  bid-ask  spreads  can  also  be  handled 
with this formalism. This is done as in Foley (1970), Jouini and Kallal
(1992) and Luttmer (1993) by constructing two payoffs according to whether a
security is purchased or sold. A short-sale constraint is imposed on both
artificial securities to enforce the distinction between a buy and a sell,
and  a  bid-ask  spread  is  modeled  by  making  the  purchase  price  higher  than  the 
sale  price. 
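The construction of the two artificial securities can be sketched as follows. The helper names `split_bid_ask` and `expected`, the two-state payoffs, the bid and ask quotes, and the candidate discount factor are hypothetical; the sign convention (a sale delivers minus the payoff at minus the bid price) is one way to encode the construction described above.

```python
def split_bid_ask(payoff, bid, ask):
    """Represent one security with a bid-ask spread as two short-sale-constrained securities.

    Buying delivers +payoff at the ask price; selling delivers -payoff at minus the bid price.
    """
    buy = list(payoff)
    sell = [-p for p in payoff]
    return [buy, sell], [ask, -bid]

def expected(probs, values):
    """Expectation over a finite set of states."""
    return sum(pr * v for pr, v in zip(probs, values))

probs = [0.5, 0.5]
m = [1.1, 0.8]                                   # hypothetical discount factor
payoffs, prices = split_bid_ask([0.9, 1.2], bid=0.96, ask=1.0)

# With nonnegative weights on both artificial securities, pricing requires
# E(m x_j) <= price_j for each j, which amounts to bid <= E(m x) <= ask.
ok = all(expected(probs, [mi * xi for mi, xi in zip(m, x)]) <= p
         for x, p in zip(payoffs, prices))
print(ok)
```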

Suppose  the  vector  of  security  market  payoffs  used  in  an  econometric 
analysis is denoted $x$ with a corresponding price vector $q$. The vector
$x$ is used to generate a collection of payoffs formed using portfolio weights
in a closed convex cone $C$ of $\mathbb{R}^n$:

(2.1)    $P = \{p : p = \alpha' x \text{ for some } \alpha \in C\}$.

The cone $C$ is constructed to incorporate all of the short-sale constraints
imposed in the econometric investigation. If there are no price distortions
induced by market frictions, then $C$ is $\mathbb{R}^n$. More generally, partition $x$ into
two components: $x' = [x^{n\prime}, x^{s\prime}]$ where $x^n$ contains the $k$ components whose
prices are not distorted by market frictions and $x^s$ contains the $l$ components
subject to short-sale constraints. Then the cone $C$ is formed by taking the
Cartesian product of $\mathbb{R}^k$ and the nonnegative orthant of $\mathbb{R}^l$.

Let  q  denote  the  random  vector  of  prices  corresponding  to  the  vector  x 
of  securities  payoffs.  These  prices  are  observed  by  investors  at  the  time 
assets  are  traded  and  are  permitted  to  be  random  because  the  prices  may 
reflect  conditioning  information  available  to  the  investors.  Since  it  is 
difficult  to  model  empirically  this  conditioning  information,  we  instead  work 
with  the  average  or  expected  price  vector  Eq. 

While  information  may  be  lost  in  our  failure  to  model  explicitly  the 
conditioning  information  of  investors,  some  conditioning  information  can  be 
incorporated  in  the  following  familiar  ad  hoc  manner.  Suppose  some  of  the 
security payoffs used in an econometric analysis are one-period stock or
bond-market returns with prices equal to one by construction. Additional
synthetic payoffs can be formed by an econometrician by taking one of the
original returns, say $x_j$, and multiplying it by a random variable, say $z$, in
the conditioning information set of economic agents. The corresponding
constructed payoff is then $x_j z$ with a price of $z$. Hence the price of the
synthetic payoff is random even though the price of the original security is
constant. If $x_j$ is subject to a short-sale constraint, then $z$ should be
nonnegative.

The  vehicle  linking  payoffs  to  average  prices  is  a  stochastic  discount 
factor.  To  represent  formally  this  link  and  provide  a  characterization  of  a 
stochastic discount factor, we introduce the dual of $C$, which we denote $C^*$.
This dual consists of all vectors in $\mathbb{R}^n$ whose dot product with every element
of $C$ is nonnegative. For instance, when $C$ is all of $\mathbb{R}^n$, $C^*$ consists only of
the zero vector. More generally, if $x$ can be partitioned in the manner
described previously, the elements of $C^*$ are of the form $(0', g')'$ where $g$ is
nonnegative.

A stochastic discount factor $m$ is a random variable that satisfies the
pricing relation:

(2.2)    $Eq - Emx \in C^*$.

To interpret this relation, first consider the case in which $C$ is $\mathbb{R}^n$. Then
there are no market frictions and we have linear pricing. In this case
relation (2.2) is the familiar pricing equality because $C^*$ has only one
element, namely the zero vector. Consider next the case in which $x$ can be
partitioned into the two components described previously. Partition $q$
comparably, and relation (2.2) becomes:

(2.3)    $Eq^n - Emx^n = 0$
         $Eq^s - Emx^s \geq 0$.

The inequality restriction emerges because pricing the vector of payoffs $x^s$
subject  to  short-sale  constraints  must  allow  for  the  possibility  that  these 
constraints  bind  and  hence  contribute  positively  to  the  market  price  vector. 
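Checking whether a candidate discount factor satisfies (2.2) under the partition in (2.3) can be sketched as follows. The function names `in_dual_cone` and `pricing_gap`, the two-state economy, and the numerical tolerance are illustrative assumptions; the first k securities are taken to be frictionless and the rest short-sale constrained.

```python
def in_dual_cone(v, k, tol=1e-12):
    """True if the first k entries of v are zero and the remaining entries are nonnegative."""
    return all(abs(vi) <= tol for vi in v[:k]) and all(vi >= -tol for vi in v[k:])

def pricing_gap(probs, m, X, Eq):
    """Return Eq - E(m x), where X[j][s] is the payoff of security j in state s."""
    return [qj - sum(pr * ms * xs for pr, ms, xs in zip(probs, m, xj))
            for xj, qj in zip(X, Eq)]

probs = [0.5, 0.5]
m = [1.1, 0.8]                      # hypothetical candidate discount factor
X = [[1.0, 1.0],                    # frictionless unit payoff
     [0.9, 1.2]]                    # payoff subject to a short-sale constraint

gap = pricing_gap(probs, m, X, [0.95, 1.0])
print(gap, in_dual_cone(gap, k=1))
```

In this example the unit payoff is priced exactly, while the constrained payoff carries an expected price above E(mx), which is what the inequality in (2.3) permits.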

II.A:   Maintained Assumptions

There  are  three  restrictions  on  the  vector  of  payoffs  and  prices  that 
are  central  to  our  analysis.  The  first  is  a  moment  restriction,  the  second 
is  equivalent  to  the  absence  of  arbitrage  on  the  space  of  portfolio  payoffs, 
and  the  third  eliminates  redundancy  in  the  securities. 

For  pricing  relation  (2.2)  to  have  content,  we  maintain: 


Assumption 2.1:  $E|x|^2 < \infty$, $E|q| < \infty$.

Assumption 2.2:  There exists an $m > 0$ satisfying (2.2) such that $Em^2 < \infty$.


The  positivity  component  of  Assumption  2.2  can  often  be  derived  from 
the Principle of No-Arbitrage (e.g., see Kreps 1981, Prisman 1986, Jouini and
Kallal  1992  and  Luttmer  1993).  The  Principle  of  No-Arbitrage  specifies  that 
the  smallest  cost  associated  with  any  payoff  that  is  nonnegative  and  not 
identically  equal  to  zero  must  be  strictly  positive.  Notice  that  from 
(2.2),  a  stochastic  discount  factor  m   satisfies: 

(2.4)    $\alpha' E(mx) \leq \alpha' Eq$ for any $\alpha \in C$,


which  shows  that  Assumption  2.2  implies  the  Principle  of  No-Arbitrage 
(applied  to  expected  prices). 

Next  we  limit  the  construction  of  x  by  ruling  out  redundancies  in  the 
securities: 

Assumption 2.3:  If $\alpha'x = \alpha^{*\prime}x$ and $\alpha'Eq = \alpha^{*\prime}Eq$ for some $\alpha$ and $\alpha^*$ in $C$, then
$\alpha = \alpha^*$.

In  the  absence  of  transaction  costs,  Assumption  2.3  precludes  the 
possibility  that  the  second  moment  matrix  of  x  is  singular.  Otherwise, 
there  would  exist  a  nontrivial  linear  combination  of  the  payoff  vector  x 
that  is  zero  with  probability  one.  In  light  of  (2.2),  the  (expected)  price 
of  this  nontrivial  linear  combination  would  have  to  be  zero,  violating 
Assumption  2.3.  To  accommodate  securities  whose  purchase  price  differs  from 
the  sale  price,  we  permit  the  second  moment  matrix  of  the  composite  vector  x 
to be singular. Assumption 2.3 then requires that distinct portfolio
weights used to construct the same payoff must have distinct expected
prices.¹

II.B:   Minimum Distance Problems

There  are  two  problems  that  underlie  most  of  our  analysis.  Let  M 
denote  the  set  of  all  random  variables  with  finite  second  moments  that 
satisfy (2.2), and let $M^+$ be the set of all nonnegative random variables in
$M$. In light of Assumption 2.2, both sets are nonempty. Let $y$ denote some
"proxy" variable for a stochastic discount factor that, strictly speaking,
does not satisfy relation (2.2). Following Hansen and Jagannathan (1993),
we consider the following two ad hoc least squares measures of
misspecification:

(2.5)    $\delta^2 = \min_{m \in M} E[(y - m)^2]$,

and

(2.6)    $\tilde{\delta}^2 = \min_{m \in M^+} E[(y - m)^2]$.


When  the  proxy  y  is  set  to  zero,  the  minimization  problems  collapse  to 
finding  bounds  on  the  second  moment  of  stochastic  discount  factors  as 
constructed  by  Hansen  and  Jagannathan  (1991),  He  and  Modest  (1992)  and 
Luttmer  (1993).  In  particular,  the  bounds  derived  in  Hansen  and  Jagannathan 
(1991)  are  obtained  by  setting  y  to  zero  and  solving  (2.5)  and  (2.6)  when 
there are no short-sale constraints imposed (when $C$ is set to $\mathbb{R}^n$); the bound
derived  in  He  and  Modest  is  obtained  by  solving  (2.5)  for  y  set  to  zero;  and 
the  bound  derived  by  Luttmer  (1993)  is  obtained  by  solving  (2.6)  for  y  set  to 
zero.  These  second  moment  bounds  will  subsequently  be  used  in  deriving 
feasible  regions  for  means  and  standard  deviations.  Clearly,  the  second 
moment  bound  implied  by  (2.6)  is  no  smaller  than  that  implied  by  (2.5)  since 
it  is  obtained  using  a  smaller  constraint  set. 

Next  consider  the  case  in  which  the  proxy  y  is  not  degenerate.  Hansen 
and  Jagannathan  (1993)  showed  that  the  least  squares  distance  between  a 
proxy  and  the  set  M  of  (possibly  negative)  stochastic  discount  factors  has 
an  alternative  interpretation  of  being  the  maximum  pricing  error  per  unit 
norm of payoffs in $P$, where the norm of a payoff is the square root of its
second moment. When the constraint set is shrunk to $M^+$ as in problem (2.6),
the dual interpretation takes account of potential pricing errors for
hypothetical derivative claims. While Hansen and Jagannathan (1993)
abstract from short-sale constraints in their analysis, pricing-error
interpretations are applicable more generally.²

II.C:   Conjugate Maximization Problems

In  solving  the  least  squares  problems  (2.5)  and  (2.6)  and  in  our 
development  of  econometric  methods  associated  with  those  problems,  it  is 
most  convenient  to  study  the  conjugate  maximization  problems.  They  are 
given  by 


(2.7)    $\delta^2 = \max_{\alpha \in C} \{Ey^2 - E[(y - x'\alpha)^2] - 2\alpha' Eq\}$,

and

(2.8)    $\tilde{\delta}^2 = \max_{\alpha \in C} \{Ey^2 - E[((y - x'\alpha)^+)^2] - 2\alpha' Eq\}$

where the notation $h^+$ denotes $\max\{h, 0\}$. The conjugate problems are obtained
by introducing Lagrange multipliers on the pricing constraints (2.2) and
exploiting the familiar saddle point property of the Lagrangian. The $\alpha$'s
then have interpretations as the multipliers on the pricing constraints.
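When there are no frictions (C is all of R^n), problem (2.7) is an unconstrained quadratic problem, and setting its gradient to zero gives alpha = (Exx')^{-1}(Exy - Eq) with maximized value (Exy - Eq)'alpha. The sketch below evaluates this closed form in a finite-state economy. The helper names `solve` and `frictionless_bound`, the three-state payoffs, the prices and the constant proxy are hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting (A is n x n)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z

def frictionless_bound(probs, X, Eq, y):
    """Solve (2.7) with C = R^n; X[j][s] is the payoff of security j in state s."""
    n = len(X)
    def E(v):
        return sum(pr * vi for pr, vi in zip(probs, v))
    A = [[E([u * w for u, w in zip(X[i], X[j])]) for j in range(n)] for i in range(n)]
    b = [E([xs * ys for xs, ys in zip(X[i], y)]) - Eq[i] for i in range(n)]
    alpha = solve(A, b)                                   # first-order condition A alpha = b
    delta2 = sum(bi * ai for bi, ai in zip(b, alpha))     # maximized criterion b' alpha
    m = [ys - sum(alpha[i] * X[i][s] for i in range(n)) for s, ys in enumerate(y)]
    return alpha, delta2, m

probs = [0.25, 0.5, 0.25]
X = [[1.0, 1.0, 1.0], [0.6, 1.0, 1.5]]   # unit payoff and a risky payoff
Eq = [0.95, 0.95]                         # hypothetical expected prices
y = [0.9, 0.9, 0.9]                       # hypothetical constant proxy
alpha, delta2, m = frictionless_bound(probs, X, Eq, y)
print(alpha, delta2, m)
```

The returned m = y - x'alpha prices both securities exactly, and delta2 coincides with E[(y - m)^2], which is the duality between (2.5) and (2.7) in this special case.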

The  conjugate  problems  in  (2.7)  and  (2.8)  are  convenient  because  the 
choice  variables  are  finite-dimensional  vectors  whereas  the  choice  variables 
in  the  original  least  squares  problems  are  random  variables  that  reside  in 
possibly  infinite-dimensional  constraint  sets.   The  specifications  of  the 
conjugate problems are justified formally in Hansen and Jagannathan (1993)
and Luttmer (1993). Of particular interest to us is that the criteria for
the maximization problems are concave in $\alpha$ and that the first-order
conditions for the solutions are given by:

(2.9)    $Eq - E[(y - x'\alpha)x] \in C^*$

in the case of problem (2.7) and

(2.10)   $Eq - E[(y - x'\alpha)^+ x] \in C^*$

along with the respective complementary slackness conditions

(2.11)   $\alpha'Eq - \alpha'E[(y - x'\alpha)x] = 0$,

and

(2.12)   $\alpha'Eq - \alpha'E[(y - x'\alpha)^+ x] = 0$.

In fact, optimization problem (2.7) is a standard quadratic programming
problem. Interpreting the first-order conditions for these problems, observe
that associated with a solution to problem (2.7) is a random variable $m = (y - x'\alpha)$
in $M$ and associated with a solution to problem (2.8) is a nonnegative
random variable $\tilde{m} = (y - x'\alpha)^+$ in $M^+$. These random variables are the unique
(up  to  the  usual  equivalence  class  of  random  variables  that  are  equal  with 
probability  one)  solutions  to  the  original  least  squares  problems. 
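When some weights are restricted to be nonnegative, or when the truncation in (2.8) matters, a closed form is generally unavailable. One simple numerical approach, used here purely for illustration and not taken from the paper, is projected gradient ascent: the criterion in (2.8) is concave with gradient 2E[(y - x'alpha)^+ x] - 2Eq, and projecting onto C amounts to setting negative constrained weights to zero. The function name `truncated_bound`, the step-size rule, the iteration count and the example economy are assumptions.

```python
def truncated_bound(probs, X, Eq, y, n_free, iters=5000):
    """Maximize the criterion in (2.8) over C = R^k x R^l_+ by projected gradient ascent.

    X[j][s] is the payoff of security j in state s. The first n_free securities are
    unconstrained; the remaining ones carry short-sale constraints (weights >= 0).
    """
    n, S = len(X), len(probs)
    def E(v):
        return sum(pr * vi for pr, vi in zip(probs, v))
    def truncated_m(alpha):
        return [max(y[s] - sum(alpha[j] * X[j][s] for j in range(n)), 0.0) for s in range(S)]
    # The gradient is Lipschitz with constant at most 2 * trace(Exx'), so this step is safe.
    step = 1.0 / (2.0 * sum(E([x * x for x in X[j]]) for j in range(n)))
    alpha = [0.0] * n
    for _ in range(iters):
        m = truncated_m(alpha)
        grad = [2.0 * (E([m[s] * X[j][s] for s in range(S)]) - Eq[j]) for j in range(n)]
        alpha = [a + step * g for a, g in zip(alpha, grad)]
        alpha = [a if j < n_free else max(a, 0.0) for j, a in enumerate(alpha)]  # project onto C
    m = truncated_m(alpha)
    value = (E([ys * ys for ys in y]) - E([ms * ms for ms in m])
             - 2.0 * sum(a * q for a, q in zip(alpha, Eq)))
    return alpha, value, m

probs = [0.25, 0.5, 0.25]
X = [[1.0, 1.0, 1.0], [0.6, 1.0, 1.5]]   # unit payoff (free) and a short-sale-constrained payoff
y = [0.9, 0.9, 0.9]                       # hypothetical constant proxy
alpha, value, m = truncated_bound(probs, X, [0.95, 1.0], y, n_free=1)
print(alpha, value, m)
```

In this example the short-sale constraint binds: the weight on the second security is zero, the implied discount factor is constant, and the second security satisfies (2.10) as a strict inequality.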

Since  Assumption  2.3  eliminates  redundant  securities  and  the  random 
variable $(y - x'\alpha)$ is uniquely determined, the solution $\alpha$ to conjugate
problem (2.7) is also unique. This follows because the value of the
criterion must be the same for all solutions, implying that they all must
have the same expected price $\alpha'Eq$. The solution to conjugate problem (2.8)
may not be unique, however. In this case the truncated random variable
$(y - x'\alpha)^+$ is uniquely determined, as is the expected price $\alpha'Eq$. On the other
hand, the random variable $(y - x'\alpha)$ is not necessarily unique, so we cannot
exploit Assumption 2.3 to verify that the solution $\alpha$ is unique. As we will
now demonstrate, the set of solutions is convex and compact.

The  convexity  follows  immediately  from  the  concavity  of  the  criterion 
function  and  the  convexity  of  the  constraint  set.  Similarly,  the  set  of 
solutions  must  be  closed  because  the  constraint  set  is  closed  and  the 
criterion  function  is  continuous. 

Boundedness  of  the  set  of  solutions  can  be  demonstrated  by  investigating 
the  tail  properties  of  the  criterion  functions.  We  consider  two  cases: 
directions $\theta$ for which $\theta'x$ is negative with positive probability and
directions $\theta$ for which $\theta'x$ is nonnegative. To study the former case we take
the criterion in (2.8) and divide it by $1 + |\alpha|^2$. For large values of $|\alpha|$
the scaled criterion is approximately:

(2.13)    $-E[((-x'\theta)^+)^2]$ where $\theta = \alpha/(1 + |\alpha|^2)^{1/2}$.

Hence $|\theta|$ is approximately one for large values of $|\alpha|$. Moreover, $\theta'x$ is a
payoff in $P$. Consequently, the unscaled criterion will decrease (to $-\infty$)
quadratically for large values of $|\alpha|$.

Consider next directions $\theta$ for which $\theta'x$ is nonnegative. From
Assumption 2.2 and relation (2.4) we have that

(2.14)    $Em(\theta'x) \leq \theta'Eq$

for some $m$ that is strictly positive with probability one. Hence $\theta'Eq$ must
be strictly positive unless $\theta'x$ is identically zero. However, when $\theta'x$ is
identically zero, it follows from Assumption 2.3 and inequality (2.14) that
$\theta'Eq$ is still strictly positive.

For directions $\theta$ for which the payoff $\theta'x$ is nonnegative, we study the
tail behavior of the criterion after dividing by $(1 + |\alpha|^2)^{1/2}$, which yields
approximately $-\theta'Eq$ for large values of $|\alpha|$. Hence in these directions the
unscaled criterion must diminish (to $-\infty$) at least linearly in $|\alpha|$. Thus
in either case, we find that the set of solutions to conjugate problem (2.8)
is bounded.

For some but not all of the results in the subsequent sections, we will
need for there to exist a unique solution to conjugate problem (2.8). Since
the set of solutions is convex, local uniqueness implies global uniqueness.
To display a sufficient condition for local uniqueness, let $x^*$ denote the
component of the composite payoff vector $x$ for which the pricing relation is
satisfied with equality:

(2.15)   $E\tilde{m}x^* = Eq^*$

where $q^*$ is the corresponding price vector. Also, let $1_{\{\tilde{m}>0\}}$ be the
indicator function for the event $\{\tilde{m} > 0\}$. A sufficient condition for local
uniqueness is that

Assumption 2.4:  $E[x^* x^{*\prime} 1_{\{\tilde{m}>0\}}]$ is nonsingular.

To see why this is a valid sufficient condition, observe that from the
complementary slackness condition (2.12), $\tilde{m}$ is given by $(y - x^{*\prime}\beta)^+$ for some
vector $\beta$. Consequently,

(2.16)   $Eq^* = E[x^* y 1_{\{\tilde{m}>0\}}] - E[x^* x^{*\prime} 1_{\{\tilde{m}>0\}}]\beta$.

When the matrix $E[x^* x^{*\prime} 1_{\{\tilde{m}>0\}}]$ is nonsingular, we can solve (2.16) for $\beta$.

II.D:   Volatility Bounds and Restrictions on Means

The  second  moment  bounds  described  in  the  previous  subsection  can  be 
converted  into  standard  deviation  bounds  via  the  formulas: 


(2.17)   $\sigma = [\delta^2 - (Em)^2]^{1/2}$
         $\tilde{\sigma} = [\tilde{\delta}^2 - (Em)^2]^{1/2}$

where $\delta^2$ and $\tilde{\delta}^2$ are constructed by setting the proxy to zero. When $P$
contains a unit payoff, $Em$ is also equal to the average price of that payoff
and hence is restricted to be between the sale and purchase prices of the
unit payoff. However, data on the price of a riskless payoff is often not
available, so that it is difficult to determine $Em$. In these circumstances,
bounds can be obtained for each choice of $Em$ by adding a unit payoff to $P$
(augmenting $x$ with a 1) and assigning a price of $v$ to that payoff (augmenting
$Eq$ with $v$). In forming the augmented cone, there should be no short-sale
constraints imposed on the additional security and hence no new price
distortions should be introduced. The price assignment $v$ is equivalent to a
mean assignment for $m$. Mean-specific volatility bounds can then be obtained
using (2.7), (2.8) and (2.17).
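For a single payoff and no frictions, the augmentation just described can be carried out by hand. With the proxy set to zero, (2.7) applied to the augmented vector (1, x)' with prices (v, Eq)' has the value q_a'(E x_a x_a')^{-1} q_a, and (2.17) converts it into a standard deviation bound. The sketch below traces the bound over a few values of v; the function name `volatility_bound`, the three-state return and the grid of means are hypothetical.

```python
import math

def volatility_bound(probs, x, Eq, v):
    """Lower bound on the standard deviation of m when Em = v, using one payoff x."""
    Ex = sum(pr * xs for pr, xs in zip(probs, x))
    Exx = sum(pr * xs * xs for pr, xs in zip(probs, x))
    det = Exx - Ex * Ex                  # determinant of the augmented second moment matrix
    # delta^2 = q_a' (E x_a x_a')^{-1} q_a with x_a = (1, x)' and q_a = (v, Eq)'
    delta2 = (v * v * Exx - 2.0 * v * Ex * Eq + Eq * Eq) / det
    return math.sqrt(max(delta2 - v * v, 0.0))   # (2.17); max() guards against rounding

probs = [0.25, 0.5, 0.25]
x = [0.6, 1.0, 1.5]                      # hypothetical gross return, so its price is one
frontier = [(v, volatility_bound(probs, x, 1.0, v)) for v in (0.90, 0.95, 1.00)]
for v, s in frontier:
    print(v, s)
```

In this one-asset case the bound reduces to |Eq - v Ex| divided by the standard deviation of x, which provides a convenient check on the computation.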

The Principle of No-Arbitrage puts a limit on the admissible values of
$v$: $v \in [\lambda_0, \upsilon_0]$ where $\lambda_0$ is the lower arbitrage bound and $\upsilon_0$ is the upper
arbitrage bound. These bounds are computed using formulas familiar from
derivative claims pricing:

(2.18)    $\lambda_0 \equiv -\inf\{\alpha'Eq : \alpha \in C \text{ and } \alpha'x \geq -1\}$

and

(2.19)    $\upsilon_0 = \inf\{\alpha'Eq : \alpha \in C \text{ and } \alpha'x \geq 1\}$.

While $\lambda_0$ is always well defined via (2.18), $\upsilon_0$ may not be because there may
not exist a payoff in $P$ that dominates a unit payoff. In such
circumstances, we define $\upsilon_0$ to be $+\infty$.


III.   Econometric  Issues 

In  this  section  we  develop  consistency  and  asymptotic  distribution 
results  for  the  specification-error  bounds  presented  in  Section  II.  A  key 
presumption  underlying  our  analysis  is  that  the  data  on  asset  payoffs  and 
prices  are  replicated  over  time  in  some  stationary  fashion.  That  is, 
associated with the composite vector $(x', q', y)'$ is a stochastic process
$\{(x_t', q_t', y_t)'\}$ whose sequence of empirical distributions approximates the
joint distribution of $(x', q', y)'$. We denote integration with respect to the
empirical distribution for sample size $T$ as $\Sigma_T$. More precisely, for any $z$
that is a (Borel measurable) function of $(x', q', y)$ with a finite first
moment, we will approximate $Ez$ by $\Sigma_T z$ where

(3.1)    $\Sigma_T z = (1/T)\sum_{t=1}^{T} z_t$.


Among  other  things,  we  require  that  this  approximation  becomes  arbitrarily 
good as the sample size $T$ gets large. That is, we presume that $\{z_t\}$ obeys a
Law  of  Large  Numbers.   A  sufficient  condition  for  this  is: 


Assumption 3.1:  The composite process $\{(x_t', q_t', y_t)\}$ is stationary and
ergodic.

Under this assumption, we can think of $(x', q', y)$ as $(x_0', q_0', y_0)$.
Assumption 3.1 could be weakened in a variety of ways, but it is maintained
for pedagogical simplicity. More generally, we might imagine that the
process $\{(x_t', q_t', y_t)\}$ is asymptotically stationary, where the convergence
to the stationary distribution is sufficiently fast to ensure that the Law
of Large Numbers applies to averages of the form (3.1). In this case, the
joint distribution of $(x', q', y)$ is given by the stationary limit point of
the process $\{(x_t', q_t', y_t)\}$.

To  estimate  the  specification-error  bounds,  we  suppose  that  a  sample  of 
size  T  is  available  and  that  the  empirical  distribution  implied  by  this  data 
is  used  in  place  of  the  population  distribution.  (Thus  we  are  applying  the 
Analogy  Principle  of  Goldberger  1968  and  Manski  1988).  We  introduce  two 
random  functions  <p   and  4>' 


(3.2)    ^(a)  =     y^  -    iy  -   a'x)^  -  2a'q, 


and 
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(3.3)    $(a)  =     y^  -    iy  -  a.'x)*^   -  2a'q 


The  sample  analog  estimators  of  interest  are  given  by 


(3.4)    $(d_T)^2 = \max_{\alpha \in C} \Sigma_T[\phi(\alpha)]$


and


(3.5)    $(\tilde d_T)^2 = \max_{\alpha \in C} \Sigma_T[\tilde\phi(\alpha)]$.
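
As a rough illustration of the analog estimator in (3.4), the following Python sketch treats only the special case without market frictions ($C = R^n$) and without positivity imposed. In that case the sample criterion is a concave quadratic in the coefficient vector, so its maximizer solves a linear system. The function name and the array layout are assumptions made for this example and do not come from the paper.

```python
import numpy as np


def spec_error_bound_no_frictions(x, q, y):
    """Sample analog of (3.4) when C = R^n (hypothetical helper).

    x : (T, n) array of payoffs
    q : (T, n) array of prices
    y : (T,) array of values of the discount factor proxy
    Returns the maximizer a_T and the squared bound (d_T)^2.
    """
    x, q, y = (np.asarray(v, dtype=float) for v in (x, q, y))
    T = y.size
    sxx = x.T @ x / T                          # Sigma_T[x x']
    g = (x * y[:, None] - q).mean(axis=0)      # Sigma_T[x y - q]
    # First-order condition of the quadratic criterion:
    # Sigma_T[x x'] a = Sigma_T[x y - q]
    a_T = np.linalg.solve(sxx, g)
    # phi_t(a_T) as defined in (3.2), evaluated observation by observation
    phi = y**2 - (y - x @ a_T) ** 2 - 2.0 * (q @ a_T)
    return a_T, phi.mean()
```

With short-sale constraints, or with positivity imposed as in (3.5), the maximization generally has no closed form and would have to be carried out numerically; this sketch does not cover those cases.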


III.A:   Consistent Estimation of the Specification-Error Bounds

We first establish the statistical consistency of the estimator
sequences $\{d_T\}$ and $\{\tilde d_T\}$:

Proposition 3.1: Under Assumptions 2.1-2.3 and 3.1, $\{d_T\}$ and $\{\tilde d_T\}$ converge
almost surely to $\delta$ and $\tilde\delta$, respectively.

The  proof  of  this  proposition  is  given  in  Appendix  A.  The  basic  idea  is 
that  the  population  and  sample  criterion  functions  for  the  conjugate 
problems  are  concave  and  the  sets  of  maximizers  are  convex.  By  Assumptions 
2.1  and  3.1,  the  criterion  functions  converge  pointwise  (in  a  and  P)  almost 
surely  to  the  population  criterion  functions  introduced  in  Section  II. C.  In 
light  of  the  concavity  of  the  criterion  functions,  this  convergence  is 
uniform  on  compact  sets  almost  surely  (for  example,  see  Rockafellar  1970). 
Finally,  since  the  sets  of  maximizers  of  the  limiting  criterion  functions 
are compact, for sufficiently large T one can find a compact set such that
the  maximizers  of  the  sample  and  population  criteria  are  contained  in  that 
compact  set  (for  example,  see  Hildenbrand  1974  and  Haberman  1989).  Hence 
the  conclusion  follows  from  the  uniform  convergence  of  the  criteria  on  a 
compact  set. 

III.B:   Asymptotic  Distribution  of  the  Estimators  of  the  Bounds 

We  consider  next  the  limiting  distribution  of  the  analog  estimator 
sequences  of  the  specification-error  bounds.  Our  ability  to  express  the 
objects  of  interest  as  solutions  to  the  conjugate  problems  permits  us  to 
obtain  results  very  similar  to  those  in  the  literature  on  using  likelihood 
ratios  as  devices  for  model  selection  in  environments  when  models  are 
possibly  misspecified  (for  example,  see  Vuong  1989).  We  show  that  when  the 
specification  error  bounds  are  positive,  we  obtain  a  limiting  distribution 
that  is  equivalent  to  the  one  obtained  by  ignoring  parameter  estimation,  and 
when  the  specification  error  bound  is  zero  the  limiting  distribution  is 
degenerate.  (See  Theorem  3.3  of  Vuong  1989  page  307  for  the  corresponding 
result for likelihood ratios.)

Let $a_T$ be a maximizer of $\Sigma_T\phi$, $\alpha$ a maximizer of $E\phi$, $\tilde a_T$ a maximizer of
$\Sigma_T\tilde\phi$, and $\tilde\alpha$ a maximizer of $E\tilde\phi$. To study the limiting behavior of the
estimators, we use the decompositions:


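(3.6)    $\sqrt{T}[(d_T)^2 - \delta^2] = \sqrt{T}\,\Sigma_T[\phi(a_T) - \phi(\alpha)] + \sqrt{T}\,\Sigma_T[\phi(\alpha) - E\phi(\alpha)]$


and


(3.7)    $\sqrt{T}[(\tilde d_T)^2 - \tilde\delta^2] = \sqrt{T}\,\Sigma_T[\tilde\phi(\tilde a_T) - \tilde\phi(\tilde\alpha)] + \sqrt{T}\,\Sigma_T[\tilde\phi(\tilde\alpha) - E\tilde\phi(\tilde\alpha)]$.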




As  we  will  now  demonstrate,  the  limiting  distributions  for  the  maximized 
values  depend  only  on  the  second  terms  of  these  decompositions.  In  other 
words, the impact of replacing the unknown population maximizers by the
sample maximizers in the sample criterion functions is negligible.

Take the case of the sequence $\{(\tilde d_T)^2\}$. Then by the concavity of $\tilde\phi$, we
have the following gradient inequalities:


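(3.8)    $\tilde\phi(\tilde a_T) - \tilde\phi(\tilde\alpha) \le (\tilde m x - q)\cdot(\tilde a_T - \tilde\alpha)$

              $= [(\tilde m x - q) - E(\tilde m x - q)]\cdot(\tilde a_T - \tilde\alpha)$

              $\quad + E(\tilde m x - q)\cdot(\tilde a_T - \tilde\alpha)$.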


However,  it  follows  from  the  first-order  conditions  (including  the 
complementary  slackness  conditions)  for  the  population  conjugate  problem 
that 


(3.9)    $E(\tilde m x - q)\cdot(\tilde a_T - \tilde\alpha) = E(\tilde m x - q)\cdot\tilde a_T \le 0$.


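The inequality in (3.9) is obtained because $E(q - \tilde m x)$ is in the dual cone of C
while $\tilde a_T$ is constrained to be in C. Combining (3.8) and (3.9) we have that


(3.10)   $0 \le \sqrt{T}\,\Sigma_T[\tilde\phi(\tilde a_T) - \tilde\phi(\tilde\alpha)]$

              $\le \sqrt{T}\,\Sigma_T[(\tilde m x - q) - E(\tilde m x - q)]\cdot(\tilde a_T - \tilde\alpha)$.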

Therefore, $\{\sqrt{T}\,\Sigma_T[\tilde\phi(\tilde a_T) - \tilde\phi(\tilde\alpha)]\}$ converges in probability to zero if the
sample counterparts to the pricing errors obey a Central Limit Theorem and
the maximizers can be chosen so that $\{(\tilde a_T - \tilde\alpha)\}$ converges almost surely to
zero. This latter convergence can be demonstrated by exploiting the
concavity of the population criterion function and the convexity of the
constraint set (for example, see the discussion on page 1635 of Haberman 1989
and Appendix A).


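Assumption 3.2:  $\left\{ \sqrt{T}\,\Sigma_T \begin{bmatrix} \phi(\alpha) - E\phi(\alpha) \\ (mx - q) - E(mx - q) \end{bmatrix} \right\}$ converges in distribution to a
normally distributed random vector with mean zero and covariance matrix V.


Assumption 3.3:  $\left\{ \sqrt{T}\,\Sigma_T \begin{bmatrix} \tilde\phi(\tilde\alpha) - E\tilde\phi(\tilde\alpha) \\ (\tilde m x - q) - E(\tilde m x - q) \end{bmatrix} \right\}$ converges in distribution to
a normally distributed random vector with mean zero and covariance matrix $\tilde V$.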


More  primitive  assumptions  that  imply  the  central  limit  approximations 
underlying  Assumptions  3.2  and  3.3  are  given  by  Gordin  (1969)  and  Hall  and 
Heyde  (1980). 

Let u denote a selection vector with a one in its first position
followed by $k + \ell$ zeros. The limiting distributions for the
specification-error bound estimators are:

Proposition 3.2: Suppose that $\delta \ne 0$ and $\tilde\delta \ne 0$. Under Assumptions 2.1 -
2.3, 3.1 - 3.2, $\{\sqrt{T}[d_T - \delta]\}$ converges in distribution to a normally
distributed random variable with mean zero and variance $u'Vu/(4\delta^2)$. Under
Assumptions 2.1 - 2.4, 3.1 and 3.3, $\{\sqrt{T}[\tilde d_T - \tilde\delta]\}$ converges in distribution
to a normally distributed random variable with mean zero and variance
$u'\tilde V u/(4\tilde\delta^2)$.
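
The factor of four in the variances of Proposition 3.2 can be traced to a delta-method step that converts a limit for the squared bound into a limit for the bound itself. The following is only a sketch of that step for $d_T$, using decomposition (3.6), Proposition 3.1 and Assumption 3.2, and taking as given that the first term in (3.6) vanishes in probability; the argument for $\tilde d_T$ is parallel.

```latex
\sqrt{T}\,(d_T - \delta)
  = \frac{\sqrt{T}\,[(d_T)^2 - \delta^2]}{d_T + \delta},
\qquad
d_T + \delta \;\to\; 2\delta \quad \text{almost surely},
\\[4pt]
\sqrt{T}\,[(d_T)^2 - \delta^2]
  = \sqrt{T}\,\Sigma_T[\phi(\alpha) - E\phi(\alpha)] + o_p(1)
  \;\Rightarrow\; N(0,\; u'Vu),
\\[4pt]
\text{so that}\quad
\sqrt{T}\,(d_T - \delta) \;\Rightarrow\; N\!\left(0,\; \frac{u'Vu}{4\delta^2}\right).
```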

To use Proposition 3.2 in practice requires consistent estimation of
$u'Vu$ or $u'\tilde V u$. Consider the case of $u'\tilde V u$. For each T form the scalar
sequence $\{\tilde\phi_t(\tilde a_T): t = 1,2,3,\ldots,T\}$ and use one of the frequency zero spectral
density estimators described by Newey and West (1987) or Andrews (1991), for
example.
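
To make this step concrete, here is a minimal Python sketch of a Bartlett-kernel (Newey and West style) estimate of the frequency zero spectral density of a scalar series, followed by the implied standard error for the bound using the variance formula of Proposition 3.2. The function names and the `lags` bandwidth argument are assumptions of this example; the paper does not prescribe a bandwidth, and the input series is taken to be the criterion values already evaluated at the estimated maximizer.

```python
import numpy as np


def newey_west_lrv(z, lags):
    """Bartlett-kernel estimate of the long-run variance of a scalar series."""
    z = np.asarray(z, dtype=float)
    T = z.size
    e = z - z.mean()
    lrv = e @ e / T                               # lag-zero autocovariance
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)           # Bartlett weight
        lrv += 2.0 * weight * (e[j:] @ e[:-j]) / T
    return lrv


def bound_std_error(phi_t, lags):
    """Standard error of the bound estimator implied by Proposition 3.2.

    phi_t : criterion values at the estimated maximizer, one per observation,
            so that their sample mean is the squared bound estimate.
    """
    phi_t = np.asarray(phi_t, dtype=float)
    d2 = phi_t.mean()
    if d2 <= 0.0:
        raise ValueError("normal approximation requires a positive bound")
    # asymptotic variance u'Vu / (4 delta^2), scaled by 1/T
    return np.sqrt(newey_west_lrv(phi_t, lags) / (4.0 * d2 * phi_t.size))
```

Setting `lags=0` reduces the estimate to the sample variance, which would be appropriate only if the series were serially uncorrelated.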

As is shown in Appendix A, when the price vector q is a vector of real
numbers (degenerate random variables), the asymptotic distribution for
$\{\sqrt{T}[\tilde d_T - \tilde\delta]\}$ remains valid even when the population version of the conjugate maximum
problem fails to have a unique solution (Assumption 2.4 is violated). In
this case, the lack of identification of the parameter vector $\tilde\alpha$ does not
alter the distribution theory for the specification-error bound. While this
special case is of considerable interest, it rules out the possibility of
using conditioning information to form synthetic payoffs as described in
Section II.

Notice that if $\delta = 0$ or $\tilde\delta = 0$, Proposition 3.2 breaks down. This
occurs if y is a valid stochastic discount factor, in which case the
solutions to the population conjugate problems are $\alpha = \tilde\alpha = 0$. As a
consequence, $\phi(\alpha)$ and $\tilde\phi(\tilde\alpha)$ are both identically zero, giving rise to a
degenerate limiting distribution for $\{\sqrt{T}(d_T)^2\}$ and $\{\sqrt{T}(\tilde d_T)^2\}$. Our
subsequent results on the convergence of the parameters can be used to
establish that the rate of convergence of $\{(d_T)^2\}$ and $\{(\tilde d_T)^2\}$ is T, and that
the limiting distribution is that of a weighted sum of chi-squared random
variables (see Vuong 1989). As a result, $\{d_T\}$ and $\{\tilde d_T\}$ converge at the
rate $\sqrt{T}$, although the limiting distribution is not normal.

III.C:   Asymptotic  Distribution  of  the  Parameter  Estimators 

In several situations it is useful to examine the solutions to the
conjugate problems used in constructing the bounds. For example, it may be
of interest to examine whether a particular asset or group of assets is
important in determining the bound, or it may be of interest to determine
whether the coefficient vector is zero, in which case the bound is
degenerate. In developing a central limit approximation to do this type of
statistical inference, we initially consider the case where there are no
assets that are subject to short-sale constraints. In other words, we
assume that the cone C is $R^n$. Since, in the absence of market frictions, the
estimation problem is posed as an unconstrained maximization problem, the
limiting covariance matrices for the asymptotic distribution of the
coefficient estimators have a form that is familiar both from M estimation
(e.g., see Huber 1981) and from GMM estimation (e.g., see Hansen 1982). Our
formal derivation of this distribution theory is given in Appendix C and uses
a result from Pakes and Pollard (1989). A byproduct of our analysis in the
appendix is a (modest) weakening of the assumptions imposed in Hansen (1982)
to accommodate kinks in the moment conditions used in estimation.

The population moment conditions of interest are:

(3.11)   $E[x(y - x'\alpha) - q] = 0$,

for the specification-error bound in which the no-arbitrage restriction is
not fully exploited, and

(3.12)   $E[x(y - x'\tilde\alpha)^+ - q] = 0$,

when the no-arbitrage restriction is exploited. Equalities (3.11) and (3.12) are simply
the first-order conditions (2.9) and (2.10) for the conjugate maximum
problems when short-sale constraints are not imposed. The sample analog
estimators satisfy:

(3.13)   $\Sigma_T[x(y - x'a_T) - q] = 0$


and


(3.14)   $\Sigma_T[x(y - x'\tilde a_T)^+ - q] = 0$.


While the equations for $a_T$ are linear, those for $\tilde a_T$ are nonlinear. In
the latter case, we use a linear approximation to the moment conditions in
deriving the central limit approximation for the parameters:

(3.15)   $x(y - x'a)^+ - q \approx x(y - x'\tilde\alpha)^+ - q - xx'\,1_{\{y - x'\tilde\alpha \ge 0\}}(a - \tilde\alpha)$

              $= x(y - x'a)\,1_{\{y - x'\tilde\alpha \ge 0\}} - q$.

Notice that the function of a on the left side of (3.15) is differentiable
except at values of a such that y - x'a = 0. We assume that such sample
points are "unusual":

Assumption 3.4:  $\Pr\{y - x'\tilde\alpha = 0\} = 0$.

To  evaluate  further  the  quality  of  the  approximation  in  (3.15),  let  r(a) 
denote  the  random  approximation  error: 


(3.16)   $r(a) = |x(y - x'a)[1_{\{y - x'a \ge 0\}} - 1_{\{y - x'\tilde\alpha \ge 0\}}]|$.


It  follows  from  the  Cauchy-Schwarz  Inequality  that 

(3.17)   $r(a) \le |x(y - x'a)|\,|1_{\{y - x'a \ge 0\}} - 1_{\{y - x'\tilde\alpha \ge 0\}}|$

              $\le |x|^2\,|a - \tilde\alpha|$,


where the second inequality follows because $|x'a - x'\tilde\alpha|$ dominates $|y - x'a|$
whenever $y - x'a$ and $y - x'\tilde\alpha$ have opposite signs. Therefore, the random
approximation error satisfies:


(3.18)   $r(a)/|a - \tilde\alpha| \le |x|^2$


for $a \ne \tilde\alpha$, implying that the modulus of differentiability

(3.19)   $\mathrm{dmod}(\epsilon) = \sup\{r(a)/|a - \tilde\alpha| : |a - \tilde\alpha| < \epsilon,\ a \ne \tilde\alpha\}$

is dominated by $|x|^2$. Combined with Assumption 2.1 this implies that for any
positive value of $\epsilon$, $E[\mathrm{dmod}(\epsilon)]$ is finite. As $\epsilon \to 0$, $\mathrm{dmod}(\epsilon)$ goes to zero
except when $1_{\{y - x'\tilde\alpha = 0\}} = 1$. In this case it is possible to choose a such
that $|a - \tilde\alpha| < \epsilon$ and $1_{\{y - x'a < 0\}} = 1$ so that $r(a) = |xx'(a - \tilde\alpha)|$. However, Assumption
3.4 implies that this occurs with probability zero so that as $\epsilon \to 0$, $\mathrm{dmod}(\epsilon)$
converges almost surely to zero. As is shown in Appendix C, these
restrictions are sufficient for us to study the asymptotic behavior of the
estimator $\{\tilde a_T\}$ using the linearization on the right side of (3.15):


(3.20)   $E[x(y - x'\tilde\alpha)\,1_{\{y - x'\tilde\alpha \ge 0\}} - q] = 0$.

To use linear equation system (3.20) to identify $\tilde\alpha$, we need the
matrix $E(xx'\,1_{\{y - x'\tilde\alpha \ge 0\}})$ to be nonsingular. Given Assumption 3.4, this rank
condition  is  equivalent  to  one  in  Assumption  2.4  because  x  and  x  must 
coincide  when  no  short-sale  constraints  are  imposed.  The  counterpart  to 
this rank condition for $\alpha$ is that the second moment matrix $E(xx')$ be
nonsingular  as  required  by  Assumption  2.3. 

Working with the two linear moment conditions, we obtain the
approximations:

(3.21)   $\sqrt{T}(\tilde a_T - \tilde\alpha) \approx [E(xx'\,1_{\{y - x'\tilde\alpha \ge 0\}})]^{-1}\,\sqrt{T}\,\Sigma_T[x(y - x'\tilde\alpha)^+ - q]$

              $\sqrt{T}(a_T - \alpha) \approx [E(xx')]^{-1}\,\sqrt{T}\,\Sigma_T[x(y - x'\alpha) - q]$

where the notation $\approx$ is used to denote the fact that the differences between
the left and right sides of (3.21) converge in probability to zero. These
approximations are justified formally in Appendix B. Let $w = [0 \;\; I_n]$.
Combining approximations (3.21) with Assumptions 3.2 and 3.3 gives us the
asymptotic distribution of the analog estimators.

Proposition 3.3: Suppose Assumptions 2.1-2.3, 3.1 and 3.2 are satisfied.
Then $\{\sqrt{T}(a_T - \alpha)\}$ converges in distribution to a normally distributed random
vector with mean zero and covariance matrix $[E(xx')]^{-1}\,wVw'\,[E(xx')]^{-1}$.
Suppose Assumptions 2.1-2.4, 3.1 and 3.3-3.4 are satisfied. Then $\{\sqrt{T}(\tilde a_T - \tilde\alpha)\}$
converges in distribution to a normally distributed random vector with mean
zero and covariance matrix
$[E(xx'\,1_{\{y - x'\tilde\alpha \ge 0\}})]^{-1}\,w\tilde Vw'\,[E(xx'\,1_{\{y - x'\tilde\alpha \ge 0\}})]^{-1}$.

To apply these limiting distributions in practice requires consistent
estimators of the asymptotic covariance matrices. The terms $wVw'$ and $w\tilde Vw'$
can be estimated using one of the spectral methods referenced previously.
Under the assumptions maintained in Proposition 3.3, the matrices $E(xx')$ and
$E(xx'\,1_{\{y - x'\tilde\alpha \ge 0\}})$ can be estimated consistently by their sample analogs,
where the estimator $\tilde a_T$ is used in place of $\tilde\alpha$ in estimating the second of
these matrices.
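
The plug-in recipe just described can be sketched in Python for the positivity case. This is an illustrative sketch rather than the authors' implementation: it assumes the moment vector is $x(y - x'a)^+ - q$ evaluated at the estimated coefficients, it uses Bartlett weights as one possible spectral method, and the function name and `lags` argument are hypothetical.

```python
import numpy as np


def positivity_coef_cov(x, q, y, a_tilde, lags):
    """Estimated covariance matrix of the coefficient estimator (hypothetical helper).

    x : (T, n) payoffs, q : (T, n) prices, y : (T,) proxy values,
    a_tilde : (n,) estimated coefficients, lags : Bartlett bandwidth.
    Returns the sandwich form of Proposition 3.3 divided by T.
    """
    x, q, y, a_tilde = (np.asarray(v, dtype=float) for v in (x, q, y, a_tilde))
    T = y.size
    resid = y - x @ a_tilde
    active = (resid >= 0.0).astype(float)              # indicator 1{y - x'a >= 0}
    d = (x * active[:, None]).T @ x / T                # sample analog of E[x x' 1{...}]
    h = x * np.maximum(resid, 0.0)[:, None] - q        # x (y - x'a)^+ - q
    h = h - h.mean(axis=0)
    s = h.T @ h / T                                    # lag-zero covariance
    for j in range(1, lags + 1):                       # Bartlett-weighted autocovariances
        gamma = h[j:].T @ h[:-j] / T
        s += (1.0 - j / (lags + 1.0)) * (gamma + gamma.T)
    d_inv = np.linalg.inv(d)
    return d_inv @ s @ d_inv.T / T
```

Square roots of the diagonal entries would then serve as approximate standard errors for the individual coefficients, valid only in the case without short-sale constraints treated in the text.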

We now briefly describe how the distribution theory is modified when
some short-sale constraints are imposed (C is a proper subset of $R^n$). We
will focus on the limiting behavior of $\sqrt{T}(\tilde a_T - \tilde\alpha)$, but the results for
$\sqrt{T}(a_T - \alpha)$ are very similar. As in Section II, we partition x according to
whether or not $\tilde m$ prices the payoffs with equality, that is, by whether


(3.22)   $E\tilde m x_j = Eq_j$,  or  $E\tilde m x_j < Eq_j$.


The component coefficient estimators that multiply $x_j$'s for which there is
strict inequality will equal zero with arbitrarily high probability as the
sample size gets large. Hence the limiting distribution is degenerate for
these component estimators.

Consider next the estimator of the remaining subvector of $\tilde\alpha$, which we
denote $\beta$. Because of the degeneracy just described, we can, in effect,
treat the limiting distribution of the estimator of $\beta$ separately. Let C be
the lower-dimensional cone associated with estimating $\beta$. If $\beta$ is an
interior point of C, then the argument leading up to Proposition 3.3 can be
imitated to deduce a limiting normal distribution for the parameter
estimator. However, if $\beta$ is at the boundary of the cone C, the limiting
distribution may be a nonlinear function of a normally distributed random
vector (see Haberman 1989, page 1545).$^3$

As in any econometric estimation problem with inequality constraints,
the problematic feature of this limiting distribution theory is the manner
in which it depends on the true parameter vector $\beta$, including the associated
discontinuities. This feature makes the distribution theory harder to use
in practice and, in other settings, has led researchers to compute
approximate bounds on probabilities of test statistics (see, for example,
Wolak 1991 and Boudoukh, Richardson and Smith 1992). Recall, however, that
in  our  derivation  of  the  distribution  theory  for  the  specification-error 
bounds,  we  were  able  to  circumvent  the  need  for  a  distribution  theory  for  the 
parameter  estimators.    Thus  even  though  the  distribution  theory  for  the 
parameter  estimators  becomes  more  complicated  in  the  presence  of  market 
frictions,  the  distribution  theory  for  the  specification-error  bounds  remains 
simple. 

III.D:   Consistent  Estimation  of  the  Arbitrage  Bounds 

As we discussed in Section II, the second moment bounds can be
converted into standard deviation bounds if the mean of m is known or if it
can be estimated using the price of a risk-free asset. When Em is not known
it must be prespecified. Let v be the hypothesized mean of m when a
risk-free asset is not available. Proposition 3.1 can be applied to establish
the consistency of the second moment bound estimators for each admissible
price assignment v. In the case of $\tilde\delta^2$, for the price assignment to be
admissible, it must not induce arbitrage opportunities onto the augmented
collection of asset payoffs and prices. Any price (mean) assignment in the
open interval $(\lambda_0, \upsilon_0)$ is admissible in this sense.

The final question we explore in this section is whether the arbitrage
bounds, $\lambda_0$ and $\upsilon_0$ given in (2.18) and (2.19), can be consistently estimated
using the sample analogs:

(3.23)   $\ell_T = -\inf\{a'\Sigma_T q : a \in C \text{ and } a'x_t \ge -1 \text{ for all } t = 1,2,\ldots,T\}$


and


(3.24)   $u_T = \inf\{a'\Sigma_T q : a \in C \text{ and } a'x_t \ge 1 \text{ for all } t = 1,2,\ldots,T\}$.


The estimated upper arbitrage bound $u_T$ is always finite when there is a
payoff on a limited liability security that is never observed to be zero in
the sample. Our estimated range of the admissible values for the (average)
price of a unit payoff and hence mean of m is $[\ell_T, u_T]$. Notice that these
bounds can be computed by solving simple linear programming problems. In
Appendix A we prove:


Proposition 3.4: Under Assumptions 2.1-2.3 and 3.1, $\{\ell_T\}$ converges to $\lambda_0$
almost surely. If $\upsilon_0$ is finite, then $\{u_T\}$ converges to $\upsilon_0$ almost surely;
and if $\upsilon_0 = +\infty$, then $\{u_T\}$ diverges to $+\infty$ almost surely.
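
Because the sample arbitrage bounds are values of linear programs, they can be computed with a standard solver. The sketch below uses `scipy.optimize.linprog`; the function name, the convention that the last `n_constrained` payoffs are the ones subject to short-sale constraints, and the handling of solver status codes are assumptions of this example.

```python
import numpy as np
from scipy.optimize import linprog


def arbitrage_bounds(x, q, n_constrained=0):
    """Sample arbitrage bounds (lower, upper) for the price of a unit payoff.

    x : (T, n) payoffs, q : (T, n) prices. The last `n_constrained` portfolio
    weights are restricted to be nonnegative (short-sale constraints).
    """
    x = np.asarray(x, dtype=float)
    q_bar = np.asarray(q, dtype=float).mean(axis=0)     # average price vector
    T, n = x.shape
    free = n - n_constrained
    bounds = [(None, None)] * free + [(0.0, None)] * n_constrained   # the cone C

    def min_cost(floor):
        # minimize a'q_bar subject to a'x_t >= floor for all t and a in C
        res = linprog(q_bar, A_ub=-x, b_ub=-floor * np.ones(T), bounds=bounds)
        if res.status == 0:
            return res.fun
        if res.status == 2:        # no feasible portfolio
            return np.inf
        if res.status == 3:        # cost unbounded below
            return -np.inf
        raise RuntimeError(res.message)

    lower = -min_cost(-1.0)
    upper = min_cost(1.0)
    return lower, upper
```

For a single unconstrained gross return with unit price, the bounds reduce to the reciprocals of the largest and smallest returns observed in the sample, which gives a quick sanity check on the output.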


IV.   Applications  and  Extensions 

In  this  section  we  discuss  several  applications  and  extensions  of  the 
analysis  of  Section  III.  First  we  discuss  consistent  estimation  of  the  set 
of  feasible  means  and  standard  deviations  of  stochastic  discount  factors. 
Previously  we  showed  that  for  a  given  mean  of  the  stochastic  discount  factor, 
the  standard  deviation  bound  can  be  consistently  estimated.  However,  the 
mean  of  the  stochastic  discount  factor  typically  is  not  known.  As  a  result 
it  is  important  to  understand  the  sense  in  which  the  entire  feasible  region 
can be approximated.

We  next  examine  extensions  of  the  distribution  theory  of  Section  III 
that  are  useful  in  answering  several  questions  about  the  bounds  and  asset 
pricing  models.  First  we  extend  the  analysis  of  Section  III  to  examine 
whether,  given  an  initial  set  of  asset  returns,  additional  asset  returns 
result  in  a  change  in  the  volatility  bound.  We  call  this  set  of  tests  region 
subset  tests.  Snow  (1991)  used  this  type  of  test  to  examine  whether  returns 
on  a  portfolio  of  stocks  of  firms  with  small  capitalization  contained 
additional  information  about  the  volatility  of  stochastic  discount  factors 
over  and  above  that  found  in  the  return  on  the  market  portfolio;  Cochrane  and 
Hansen  (1992)  used  it  to  determine  whether  conditioning  information  is 
important;  Knez  (1993)  used  it  in  his  investigation  of  the  links  between  the 
markets  for  Treasury  bills,  certificates  of  deposit  and  commercial  paper;  and 
De Santis (1993) used it to study the significance of returns on foreign
securities vis-a-vis domestic securities in the construction of the
volatility bounds.$^4$

A  particular  example  of  the  region  subset  test  occurs  when  checking 
whether  a  constant  discount  factor  would  correctly  price  the  assets  under 
consideration.  This  is  a  test  of  whether  the  volatility  bounds  have  content 
and  is  an  important  initial  hypothesis  to  examine  since,  if  true,  the  bounds 
do  not  preclude  constant  discount  factors  (risk-neutral  pricing). 

We  then  show  how  the  feasible  regions  for  the  means  and 
standard-deviations  can  be  used  to  test  a  specific  model  of  the  discount 
factor.  Burnside  (1992)  and  Cecchetti,  Lam  and  Mark  (1992)  have  developed  a 
version  of  this  test  when  there  are  no  assets  subject  to  short-sale 
constraints  or  transactions  costs.  We  show  how  this  test  can  be  implemented 
in  a  relatively  simple  manner  by  exploiting  the  results  of  Section  III. 
Further, we formulate the test so that it is also applicable when there are
assets subject to short-sale constraints. As a result, this provides a (large
sample) statistical foundation for the tests of asset pricing models suggested
by He and Modest (1992) and Luttmer (1993).

Finally  we  outline  an  extension  of  the  specification-error  bound 
analysis  that  is  useful  when  the  discount  factor  proxy  under  consideration 
depends  upon  a  vector  of  unknown  parameters.  We  consider  an  estimator  of  the 
parameter  vector  that  minimizes  the  specification-error  bound  and  briefly 
describe  how  to  develop  an  asymptotic  distribution  for  this  estimator  and  for 
the  implied  bound. 

Some  of  the  formal  discussion  in  this  section  focuses  on  the  case  when 
positivity  is  imposed  in  the  construction  of  the  volatility  and 
specification-error  bounds.  Moreover,  when  considering  volatility  bounds,  we 
study  the  more  usual  case  in  which  data  on  the  prices  of  a  unit  payoff  are 
not  used  in  the  econometric  analysis.  Comparable  results  without  positivity 
or  with  a  unit  payoff  require  the  obvious  modifications  and  are  sometimes 
computationally  simpler. 

IV.A:   Consistent Estimation of the Feasible Region of Means and Standard
Deviations

As discussed in Sections I and II, it is often of interest to construct
approximations to the feasible region of ordered pairs of means and standard
deviations of stochastic discount factors implied by security market data.
Such a region can be computed with or without imposing the no-arbitrage
restriction that the stochastic discount factors be positive. Let $S_0$ denote
the region without positivity and $S_0^+$ the (closure of the) region with
positivity. Similarly, let $S_T$ and $S_T^+$ denote the sample counterparts.
The question we now turn to is in what sense $S_T$ and $S_T^+$ are good
approximations to $S_0$ and $S_0^+$.

From our results in Section III we know that when there is a unit
payoff, all four regions are vertical rays because the (average) price of
this payoff is the mean discount factor. In this case the points of origin
of the rays $S_0$ and $S_0^+$ can be estimated consistently by the points of origin
of the corresponding rays $S_T$ and $S_T^+$.

In the more usual case when data on a unit payoff and its price are not
available, matters are a little more complicated. The feasible regions are
no longer vertical rays but instead are unions of such rays resulting in
convex sets with nonempty interiors. The boundaries of these sets can be
represented as (possibly extended) real-valued functions of the ordinate
(hypothetical mean), and our previous analysis implies pointwise (in the
mean) convergence of the sample analog functions to their population
counterparts. This result implies uniform convergence of the sample analog
functions in the following sense.

Since the lower and upper arbitrage bounds can be consistently
estimated, for large enough T, the sample analog functions under positivity
are finite on any compact subset of $(\lambda_0, \upsilon_0)$. When positivity is ignored the
functions are finite on any compact subset of R. Further, these functions are
convex functions of the hypothetical mean of the discount factor. As a
result (see Theorem 10.8 of Rockafellar 1970) the sample analog functions
converge uniformly, almost surely, on any compact subset of $(\lambda_0, \upsilon_0)$ in the
case of positivity and on any compact set when positivity is ignored. One
difficulty is that the approximations deteriorate as the mean assignment, v,
approaches the arbitrage bounds in the case of positivity, or when v gets
large when positivity is ignored.

The deterioration of the sample analog to $S_0^+$ in the vicinity of a
finite arbitrage bound turns out not to be problematic. To see this,
instead of viewing the boundaries of the regions as functions of the
ordinate, we explore the approximation error from a set-theoretic vantage
point in $R^2$. Consider first the case in which $\upsilon_0 < +\infty$. Associated with a
sample of size T is an approximation error as measured by the Hausdorff
metric:


(4.1)    $\eta_T = \max\{\gamma(S_T^+, S_0^+),\ \gamma(S_0^+, S_T^+)\}$


where:


(4.2)    $\gamma(K_1, K_2) = \sup_{(v_1,w_1) \in K_1}\ \inf_{(v_2,w_2) \in K_2} |(v_1,w_1) - (v_2,w_2)|$.


Since the arbitrage bounds can be consistently estimated and the lower
boundaries of $\{S_T^+\}$ approach the lower boundary of $S_0^+$ uniformly on any
compact interval within the arbitrage bounds, the approximation error
sequence $\{\eta_T\}$ converges to zero almost surely.

Measuring  the  approximation  error  via  the  Hausdorff  metric  allows 
ordered  pairs  to  get  close  without  restricting  them  to  have  the  same 
ordinate.  In  other  words,  we  no  longer  confine  our  attention  to  "vertical" 
measures  of  distance,  as  is  the  case  when  we  view  the  boundaries  of  the 
regions  as  functions  of  the  hypothetical  (expected)  prices  of  a  unit  payoff. 
The  added  flexibility  in  the  Hausdorff  metric  permits  us  to  exploit  better 
the  consistent  estimation  of  the  upper  and  lower  arbitrage  bounds 
(Proposition 3.4).
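
For intuition about the distance in (4.1) and (4.2), the following Python sketch computes it for two finite sets of (mean, standard deviation) pairs, for example grids of points on a sample boundary and on a reference boundary. The regions in the text are unbounded sets, so a finite-grid calculation of this kind is only an approximation; the function names are assumptions of this example.

```python
import numpy as np


def directed_gap(k1, k2):
    """sup over points of k1 of the distance to the nearest point of k2."""
    k1 = np.asarray(k1, dtype=float)
    k2 = np.asarray(k2, dtype=float)
    # pairwise Euclidean distances between all points of k1 and k2
    dists = np.linalg.norm(k1[:, None, :] - k2[None, :, :], axis=2)
    return dists.min(axis=1).max()


def hausdorff(k1, k2):
    """Hausdorff distance between two finite point sets in the plane."""
    return max(directed_gap(k1, k2), directed_gap(k2, k1))
```

Because the inner minimization ranges over all points of the other set, two points can be matched even when their first coordinates differ, which is the flexibility relative to "vertical" distance described above.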

When $\upsilon_0$ is infinite, the approximation error $\eta_T$ defined by (4.1) will
be infinite. As a remedy, we replace $\gamma$ by


(4.3)    $\gamma_\rho(K_1, K_2) = \sup_{(v_1,w_1) \in K_1,\ 0 \le v_1 \le \rho}\ \inf_{(v_2,w_2) \in K_2,\ 0 \le v_2 \le \rho} |(v_1,w_1) - (v_2,w_2)|$


where $\rho$ is an arbitrary positive number greater than the lower arbitrage
bound $\lambda_0$. Then the modified approximation error will be well defined and
finite for sufficiently large T and will converge almost surely to zero.
Thus we still get uniform convergence as long as the ordinate is restricted
to a finite interval.


IV.B:   Region Subset Tests

The first set of tests we consider concerns whether the volatility bounds can
be constructed using a smaller vector of security payoffs. As in Section
III.C, we initially consider the case where there are no assets that are
subject to short-sale constraints, and we assume that the parameters are
uniquely identified. Let z denote an (n-1)-dimensional vector of assets
under consideration with price vector s, and let f be the k-dimensional
vector including the k-1 asset payoffs that are to be used to construct the
bound augmented by a unit payoff. Formally, the hypothesis of interest can
be represented as:


(4.4)    $E[z(f'\theta)^+ - s] = 0$,


         $E[(f'\theta)^+ - v_0] = 0$.


One possibility is to test this hypothesis for a prespecified $v_0$, and the
other is to test whether it is satisfied for some $v_0 > 0$.

Consider first the case in which $v_0$ is prespecified. To map this into
the setup of Section III, form the n-dimensional vector x by augmenting z
with a unit payoff and form q by augmenting s with the "price" $v_0$. Then
hypothesis (4.4) can be interpreted as a zero restriction on the coefficient
vector $\tilde\alpha$ employed in Sections II and III. The components of the coefficient
vector $\tilde\alpha$ set to zero are those that multiply the entries of z omitted from f. A
large sample Wald test of this zero restriction can be formed by applying the
limiting distribution in Proposition 3.3. Alternatively, we could construct
a test by solving the GMM optimization problem:

(4.5)    $\min_\theta\ \{\sqrt{T}\,\Sigma_T[x(f'\theta)^+ - q]\}'\,(w\tilde Vw')^{-1}\,\{\sqrt{T}\,\Sigma_T[x(f'\theta)^+ - q]\}$,


where $w\tilde Vw'$ is estimated by one of the spectral estimators referenced in Section III.
Since this test is embedded within a GMM estimation problem, the analysis in
Section III.C can be easily modified to show that the minimized value of the
criterion function is asymptotically distributed as a chi-square random
variable with n - k degrees of freedom (see Hansen 1982). When the hypothesis
of interest is modified to be for some $v_0 > 0$, this GMM approach is modified
by minimizing the criterion in (4.5) by choice of $\theta$ and $v_0$ with a
corresponding loss in the degrees of freedom.
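
As an illustration of the structure of this test, the Python sketch below evaluates a quadratic-form criterion in the sample moment conditions at a given parameter value, assuming the moments take the form $\Sigma_T[x(f'\theta)^+ - q]$ and that an estimate of their long-run covariance matrix is supplied by the caller. The function name and argument layout are hypothetical, and the minimization over $\theta$ is left to a numerical optimizer of the reader's choice.

```python
import numpy as np


def region_subset_criterion(theta, x, q, f, s_hat):
    """Quadratic-form GMM criterion at theta (hypothetical helper).

    x : (T, n) payoffs (z augmented by a unit payoff)
    q : (T, n) prices (s augmented by the hypothesized mean)
    f : (T, k) payoffs used to build the candidate discount factor
    s_hat : (n, n) estimate of the long-run covariance of the moments
    """
    theta, x, q, f, s_hat = (
        np.asarray(v, dtype=float) for v in (theta, x, q, f, s_hat)
    )
    T = x.shape[0]
    m = np.maximum(f @ theta, 0.0)               # candidate discount factor (f'theta)^+
    g = (x * m[:, None] - q).mean(axis=0)        # sample moment conditions
    return T * g @ np.linalg.solve(s_hat, g)     # T * g' S^{-1} g
```

The minimized value would then be compared with a chi-square critical value with n - k degrees of freedom, as described in the text.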

One special case of this setup is a test for risk-neutral pricing. In
this case k is one and f contains only a unit payoff. In other words, a
constant discount factor prices the securities correctly on average and the
volatility bound is zero. Furthermore, the central limit approximation for
the volatility bound estimator (without imposing risk-neutral pricing) is
degenerate since the second moment of the discount factor is a constant. For
both of these reasons, a test for risk-neutral pricing is a useful starting
point in an empirical investigation.

Conducting  such  tests  can  proceed  as  described  with  one  modification. 
There  is  no  sampling  error  associated  with  the  second  moment  condition  in 
(4.4) implying that wVw' is singular. Instead this second condition should
be imposed ($\theta = v_0$) and testing should be based only on the initial set of
moment conditions. When $v_0$ is not known, this second condition should be
omitted and $\theta$ should be restricted to be nonnegative when solving
minimization problem (4.5).
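
As an illustration only, the following Python sketch tests whether a constant discount factor prices the securities on average, using just the initial set of moment conditions. It is a simplified stand-in for (4.5), not the criterion itself: the moment function `q_t - theta * x_t`, the function name `risk_neutral_j_test`, and the use of a plain sample covariance in place of a spectral estimator (so serial correlation is ignored) are all assumptions of the sketch.

```python
import numpy as np

def risk_neutral_j_test(x, q):
    """Two-step GMM test of E[q] = theta * E[x] for a scalar theta.

    x, q: arrays of shape (T, n) holding payoffs and prices.
    Returns (theta_hat, j_stat, degrees_of_freedom).
    """
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    T, n = x.shape
    xbar, qbar = x.mean(axis=0), q.mean(axis=0)

    # Step 1: identity weighting gives a preliminary estimate of theta.
    theta1 = (xbar @ qbar) / (xbar @ xbar)

    # Step 2: weight by the inverse covariance of the first-step moments.
    # (Assumption: no serial correlation, so no spectral estimator is used.)
    g = q - theta1 * x
    S = np.cov(g, rowvar=False, bias=True).reshape(n, n)
    S_inv = np.linalg.inv(S)
    theta2 = (xbar @ S_inv @ qbar) / (xbar @ S_inv @ xbar)

    # Minimized criterion: T * gbar' S^{-1} gbar, with n - 1 degrees of freedom.
    gbar = qbar - theta2 * xbar
    j_stat = T * (gbar @ S_inv @ gbar)
    return theta2, j_stat, n - 1
```

The returned statistic would be compared with a chi-square distribution with n - 1 degrees of freedom, in line with k = 1 in the text.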

Region  subset  tests  without  positivity  turn  out  to  be  closely  connected 
to  tests  of  factor  structure  and  mean-standard  deviation  boundary 
intersection  tests  (e.g.,  see  Braun  1992  and  Knez  1993  for  an  elaboration). 
This  connection  follows  from  the  duality  of  the  mean-standard  deviation 
frontier  for  stochastic  discount  factors  and  the  comparable  frontier  for 
returns  (see  Hansen  and  Jagannathan  1991  for  an  elaboration).  Also,  the  Wald 
and  GMM  test  statistics  coincide  because  the  moment  conditions  are  linear  in 
the parameter vector $\theta$.

The tests using the criterion (4.5) rely upon the distribution theory of
the estimator of $\alpha$. To use the theory of Section III.C requires that there
are no securities subject to short-sale constraints. As we discussed in that
subsection, the presence of the inequality restriction on the parameter
vector $\alpha$ can complicate the distribution theory of the parameter estimators.
As a result, testing zero restrictions on a subvector of $\alpha$ when some of the
remaining coefficients are against a nonnegativity constraint can be
problematic.  On  the  other  hand,  the  results  of  Haberman  (1989)  could  be  used 
to  develop  such  a  test  when  the  zero  restrictions  are  imposed  on  at  least  all 
of the securities to which the short-sale constraints apply.

IV.C: Testing a Specific Model of the Discount Factor using Volatility
Bounds

Suppose that in addition to asset market data, a model of the discount
factor is posited and a time series of observations of the discount factor is
available: $\{m_t : t = 1, \ldots, T\}$. One way to test the model is to examine
whether it satisfies the volatility bounds discussed in Sections II and III.
Since observations of the discount factor are available, the average price of
a unit payoff can be estimated by the mean of m. Specifically, form x by
augmenting the original vector of payoffs with a unit payoff; form q by
augmenting the original vector of prices with the random variable m; and form
C by constructing the Cartesian product of the original cone with R. In
effect, we have added a unit payoff with an average price m that is not
subject to a short-sale constraint. In forming a test, we can apply the
results of Sections II and III.B with one minor modification. The random
functions $\phi$ and $\tilde\phi$ are now constructed by setting the proxy y to zero and
subtracting $m^2$:


(4.6)     $\phi(\alpha) \equiv -(-\alpha' x)^2 - 2\alpha' q - m^2$ ,


and


(4.7)     $\tilde\phi(\alpha) \equiv -[(-\alpha' x)^+]^2 - 2\alpha' q - m^2$ .

Subtracting $m^2$ does not alter the solutions to either the sample or
population maximization problems. It does, however, change the maximized
values of the criterion functions. The volatility bounds for $Em^2$ will be
satisfied if, and only if,


(4.8)     $\epsilon \equiv \max_{\alpha \in C} E\phi(\alpha) \le 0$,


when positivity is ignored, or


(4.9)     $\tilde\epsilon \equiv \max_{\alpha \in C} E\tilde\phi(\alpha) \le 0$,


when positivity is imposed. The limiting distribution reported in
Proposition 3.2 (appropriately modified) can be applied to construct a test
of these hypotheses using sample analog estimators of $\epsilon$ and $\tilde\epsilon$. Again, we
have formulated the problem so that approximation error due to parameter
estimation plays no role in the limiting distributions for these sample
analogs.

In  practice  we  find  the  solutions  for  the  sample  maximization  problems, 
estimate  the  asymptotic  standard  errors,  and  form  one-sided  tests.  In 
particular, let $c_T$ be the maximized value of $\Sigma_T \tilde\phi$ over the constraint set C.
Then $\{\sqrt{T}[c_T - \tilde\epsilon]\}$ converges in distribution to a normal random variable with
mean  zero  and  variance  u'l^u.  This  variance  can  be  estimated  in  the  manner 
described in Section III.B. Since $\tilde\epsilon$ is not specified under the null
hypothesis (4.9), the "conservative" choice of $\tilde\epsilon = 0$ is used in constructing
the test statistic.
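
The steps just described can be sketched in Python for the simplest case, in which positivity is ignored and C is all of R^n, so that the sample maximizer has a closed form. This is a minimal sketch under stated assumptions rather than the procedure of Section III.B: the function name `volatility_bound_test` is hypothetical, and the standard error uses the plain sample variance of the criterion, which ignores serial correlation.

```python
import math
import numpy as np

def volatility_bound_test(x, q, m):
    """One-sided test that Em^2 satisfies the volatility bound (no positivity).

    x: (T, n) payoffs, q: (T, n) prices, m: (T,) discount factor observations.
    Returns (c_T, standard error, z statistic, one-sided p-value).
    """
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    m = np.asarray(m, dtype=float)
    T = x.shape[0]
    xa = np.column_stack([x, np.ones(T)])   # payoffs augmented with a unit payoff
    qa = np.column_stack([q, m])            # prices augmented with m

    # Unconstrained maximizer of the sample mean of
    # phi_t(a) = -(a'x_t)^2 - 2 a'q_t - m_t^2.
    Mxx = xa.T @ xa / T
    qbar = qa.mean(axis=0)
    a_hat = -np.linalg.solve(Mxx, qbar)

    phi = -(xa @ a_hat) ** 2 - 2.0 * (qa @ a_hat) - m ** 2
    c_T = phi.mean()

    # Parameter estimation error plays no role in the limit, so the standard
    # error is based on phi evaluated at a_hat. Assumption of this sketch:
    # phi_t is serially uncorrelated.
    se = phi.std(ddof=0) / math.sqrt(T)
    z = c_T / se                                   # uses the conservative null value 0
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z) for standard normal Z
    return c_T, se, z, p_value
```

A large positive z (small p-value) is evidence against the bound; when the candidate m prices the payoffs exactly in the sample, c_T is nonpositive by construction.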

Finally, when there are no transactions costs, short-sale constraints or
other constraints to be considered, the asymptotic distribution of the
estimators can be used to construct a different test (analogous to a
Likelihood Ratio test) that exploits the inequality restriction in the
first-stage estimation of the bound. Considering the case when positivity is
imposed, the parameter vector $\alpha$ that solves problem (4.9) satisfies the
moment conditions:

(4.10)  $E[x(-x'\alpha)^+ - q] = 0$ .

The inequality restriction (4.9) implies the further moment condition:

(4.11)  $E[\tilde\phi(\alpha) - \tilde\epsilon] = 0$

for some $\tilde\epsilon \le 0$.

Without a constraint on the parameter $\tilde\epsilon$, moment conditions (4.10) and
(4.11) exactly identify $\alpha$ and $\tilde\epsilon$. However, with the restriction that $\tilde\epsilon$ be
nonpositive, we can set up a GMM criterion in the parameter vector ($\alpha$, $\tilde\epsilon$)
using the moment conditions (4.10) and (4.11) and minimize this function
subject to the constraint that $\tilde\epsilon \le 0$. If the population moments of m are on
the boundary of the feasible region S (that is, when $\tilde\epsilon = 0$), then the
appropriately scaled minimized GMM criterion function has a limiting
distribution that has probability one-half of being zero, with the remaining
half of the probability allocated according to a chi-square
distribution with one degree of freedom. This chi-square distribution then
bounds the distribution of the GMM test statistics (under the null
hypothesis) for negative values of $\tilde\epsilon$. When positivity of the
stochastic discount factor is ignored and the $\phi$ random function is used in
place of $\tilde\phi$, it can be shown that the resulting test statistic coincides with
the test statistic based solely on (4.6).
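
The boundary-case limiting distribution described above (probability one-half at zero, the other half following a chi-square distribution with one degree of freedom) gives a simple probability value. The Python sketch below is illustrative; the function name `chi_bar_square_pvalue` is hypothetical.

```python
import math

def chi_bar_square_pvalue(stat):
    """P-value under the 50:50 mixture of a point mass at zero and chi-square(1)."""
    if stat < 0:
        raise ValueError("the minimized GMM criterion cannot be negative")
    if stat == 0:
        return 1.0
    # P(chi2_1 > s) = erfc(sqrt(s / 2)); only half of the mass sits on chi2_1.
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))
```

Under this mixture the 5 percent critical value is the 90th percentile of the chi-square distribution with one degree of freedom (about 2.71) rather than the 95th percentile (about 3.84).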

Similar approaches to testing a model of the discount factor can be
applied when a time series for m can be constructed from simulated data
instead of actual data. In this case the randomness of $\tilde\phi(\alpha)$ can be
decomposed additively into two components, one due to the randomness of the
security market payoffs and prices and the other due to the simulation of m.
As  in  the  work  of  McFadden  (1989),  Pakes  and  Pollard  (1989),  Lee  and  Ingram 
(1991)  and  Duffie  and  Singleton  (1993),  the  asymptotic  variance  in  the 
limiting  distribution  will  now  have  an  extra  component  due  to  the  sampling 
error  induced  by  simulation. 

When the first two moments of m can be computed numerically with an
arbitrarily high degree of accuracy, we can proceed as follows. Augment the
price vector with Em instead of m and subtract $Em^2$ from the criteria instead
of $m^2$ as in (4.6) and (4.7). This same strategy can be employed to assess
the accuracy of the estimated feasible region for means and standard
deviations of stochastic discount factors. For any hypothetical
mean-standard deviation pair for m, one can compute the corresponding test
statistic and probability value.
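
To make the estimated feasible region concrete, the following Python sketch traces its boundary point for a hypothetical mean v when positivity is ignored and there are no portfolio constraints. It augments the payoffs with a unit payoff priced at v and evaluates the closed-form sample bound on Em^2. The function name `sdf_std_bound` is an assumption of the sketch, and no test statistic or standard error is computed here.

```python
import numpy as np

def sdf_std_bound(x, q, v):
    """Sample lower bound on the standard deviation of m when Em = v.

    x: (T, n) payoffs, q: (T, n) prices, v: hypothetical mean of m.
    Positivity of m and portfolio constraints are ignored.
    """
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    T = x.shape[0]
    xa = np.column_stack([x, np.ones(T)])   # unit payoff appended
    qa = np.append(q.mean(axis=0), v)       # its hypothetical average price is v
    Mxx = xa.T @ xa / T
    second_moment = qa @ np.linalg.solve(Mxx, qa)   # bound on Em^2
    return np.sqrt(max(second_moment - v ** 2, 0.0))
```

Evaluating the function on a grid of values for v traces out the estimated mean-standard deviation frontier for stochastic discount factors.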

IV.D: Minimizing the Specification-Error Bound for Parameterized Families of
Models

Recall that the specification-error bounds provide a way to assess the
usefulness of an asset pricing model even when it is technically
misspecified. In many situations the discount factor proxy depends on
unknown parameters. For example, in a representative consumer model with
constant relative risk aversion preferences, the pure rate of time preference
and the coefficient of relative risk aversion are typically unknown. In this
case one way to estimate the parameters of the model is to minimize the
specification error. Alternatively, in an observable factor model, the
discount factor proxy depends on a linear combination of the factors with
unknown coefficients. As in the work of Shanken (1987) one could imagine
selecting factor coefficients to minimize the specification error. We now
sketch how the results of Section III.B extend in a straightforward manner to
obtain a distribution theory for the minimized value of the specification-error
bound.

Suppose that the discount factor proxy y depends on the parameter vector
$\beta \in B$ where B is a compact set. The population optimization problems of
interest are now:


(4.12)  $\delta^2 = \min_{\beta \in B} \max_{\alpha \in C} \{ E y(\beta)^2 - E[(y(\beta) - x'\alpha)^2] - 2\alpha' Eq \}$,


and


(4.13)  $\tilde\delta^2 = \min_{\beta \in B} \max_{\alpha \in C} \{ E y(\beta)^2 - E\{[(y(\beta) - x'\alpha)^+]^2\} - 2\alpha' Eq \}$ .

When $\delta$ and $\tilde\delta$ are strictly positive and the parameterized family of stochastic
discount factors satisfies the appropriate smoothness and moment
restrictions, an extended version of Theorem 3.2 can be obtained for the
sample analog estimators of $\delta$ and $\tilde\delta$. Again the limiting distribution will be
the  same  as  if  the  solutions  to  the  population  optimization  problems  were 
known  a  priori. 
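
A minimal Python sketch of the sample analog of (4.12) follows, for the case without positivity and with C equal to R^n, where the inner maximization has a closed form (a quadratic form in the average pricing errors). The outer minimization is done by grid search purely for transparency. The names `spec_error_sq`, `minimize_spec_error` and `proxy` are hypothetical, and the grid search is an assumption of the sketch, not a method taken from the paper.

```python
import numpy as np

def spec_error_sq(y, x, q):
    """Inner maximization of the sample analog of (4.12) with C = R^n.

    Maximizing over alpha gives gap' (Exx')^{-1} gap, where
    gap = E[x y] - E[q] is the vector of average pricing errors.
    """
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    y = np.asarray(y, dtype=float)
    T = x.shape[0]
    gap = (x * y[:, None]).mean(axis=0) - q.mean(axis=0)
    Mxx = x.T @ x / T
    return gap @ np.linalg.solve(Mxx, gap)

def minimize_spec_error(proxy, grid, x, q):
    """Grid search over beta; proxy(beta) returns the (T,) series y_t(beta)."""
    values = np.array([spec_error_sq(proxy(b), x, q) for b in grid])
    best = int(np.argmin(values))
    return grid[best], values[best]
```

For example, `proxy` could map a risk aversion coefficient into a power-utility discount factor series built from consumption growth data; a finer grid or a numerical optimizer could replace the grid search.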

The approach can be extended to compare the smallest
specification errors for two nonnested families of models. Such a comparison
potentially can be used as a device for selecting between the two families of
models. Vuong (1989) examined a very similar problem by using the large
sample behavior of likelihood ratios for two nonnested families of
misspecified models (in particular, see the discussion in Section 5 of Vuong
1989); and we can imitate and adapt his analysis to our problem. More
precisely, let $\delta_1$ and $\delta_2$ denote the specification-error bounds associated
with two such families. Take the null hypothesis to be:


(4.14)    $\delta_1 = \delta_2$ .


Under the null hypothesis, the smallest specification error associated with
each parameterized family is the same. As a consequence the performance of
the two parameterized families cannot be ranked once sampling error is
accounted for. This hypothesis can be tested by using the corresponding
distribution theory for the difference between the analog estimators of $\delta_1$
and $\delta_2$ scaled by the square root of the sample size.

Finally, we sketch the distribution theory for the coefficient
estimators when there is a parameterized family of discount factor proxies.
Suppose that the unique solution $\beta$ of (4.13) is contained in the interior of
B, the parameterized family satisfies the appropriate smoothness and
moment restrictions, and no short-sale constraints are imposed. The
population moment conditions are given by:


(4.15)   $E\{x[y(\beta) - x'\alpha]^+ - q\} = 0$;


and

(4.16)   $E\left[\frac{\partial y}{\partial \beta}(\beta)\,\{y(\beta) - [y(\beta) - \alpha'x]^+\}\right] = 0$.

The distribution theory for the analog estimators $\{b_T\}$ and $\{a_T\}$ of $\beta$ and $\alpha$
respectively can be deduced by taking linear approximations to the sample
moment conditions (4.15) and (4.16) and appealing to the results of Appendix
C.

V.   Concluding  Remarks 

In  this  paper  we  provided  statistical  methods  for  assessing  asset 
pricing  models  using  specification-error  and  volatility  bounds.  In 
developing  these  procedures,  it  was  advantageous  to  exploit  duality  theory 
and  represent  the  measurements  of  interest  as  solutions  to  unconstrained 
conjugate  maximization  problems.  This  duality  approach  simplifies  both 
computations  and  statistical  inferences.  The  resulting  statistical 
procedures  can  account  for  market  frictions  due  to  transactions  costs  or 
short-sale  constraints,  and  are  often  easier  to  interpret  than  standard  tests 
of  asset  pricing  models.  For  the  most  part  these  methods  are  quite  easy  to 
implement,  even  when  market  frictions  are  considered.  They  are  designed  to 
provide  a  better  understanding  of  the  statistical  failures  of  some  popular 
asset  pricing  models  and  to  offer  guidance  in  improving  these  models. 

Among  other  things,  the  results  in  this  paper  allow  one  to  do  the 
following:  (i)  to  test  whether  a  specific  model  of  the  stochastic  discount 
factor  satisfies  the  volatility  bounds  implicit  in  asset  market  returns;  (ii) 
to  compare  the  information  about  the  means  and  standard  deviations  of 
discount  factors  contained  in  different  sets  of  asset  returns;  and  (iii)  to 
test  hypotheses  about  the  size  of  possible  pricing  errors  of  misspecified 
asset pricing models. An advantage of (i) is that the resulting test is
robust to misspecification of the joint distribution of asset returns and the
stochastic discount factor. With regard to (ii), our results permit these
comparisons among data sets to be made independent of a specific stochastic
discount factor model. Our motivation for (iii) is to shift the focus of
statistical analyses of asset pricing models away from whether the models are
correctly specified and towards measuring the extent to which they are
misspecified.

Appendix A: Consistency

In this appendix we demonstrate formally the results of Sections III.A
and III.D. We maintain Assumptions 2.1-2.3, and 3.1 throughout.

Let U denote a compact set in $R^n$. For any subset h of U, we let cl(h)
denote the closure of h. Let K denote the collection of all nonempty closed
subsets of U. We use the Hausdorff metric $\eta$ on K given by


(A.1)     $\eta(h_1, h_2) = \max\{ \sup_{\alpha_1 \in h_1} \inf_{\alpha_2 \in h_2} |\alpha_1 - \alpha_2| , \ \sup_{\alpha_2 \in h_2} \inf_{\alpha_1 \in h_1} |\alpha_1 - \alpha_2| \}$


to define notions of convergence of compact sets. For some of our results we
will use the construct of a lim sup of a sequence in K. We follow
Hildenbrand (1974) and define:

Definition A.1: For a sequence $\{h_j\}$ in K, $\limsup_j h_j = \bigcap_{m} \mathrm{cl}\left( \bigcup_{j \ge m} h_j \right)$.

Since the lim sup is the intersection of a decreasing sequence of closed
sets, it is closed and not empty. An alternative way to characterize the lim
sup is to imagine forming sequences of points by selecting a point from each
$h_j$. All of the limit points of convergent subsequences are in the lim sup,
and, in fact, all of the elements of the lim sup can be represented in this
manner.
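
For intuition about the Hausdorff metric in (A.1), the following Python sketch evaluates it for two finite sets of points, where each supremum and infimum reduces to a maximum and minimum over pairwise Euclidean distances. The function name `hausdorff` is hypothetical, and finite sets are of course only a special case of the closed sets in K.

```python
import numpy as np

def hausdorff(h1, h2):
    """Hausdorff distance (A.1) between finite point sets of shape (m, n) and (p, n)."""
    h1 = np.atleast_2d(np.asarray(h1, dtype=float))
    h2 = np.atleast_2d(np.asarray(h2, dtype=float))
    # dist[i, j] is the Euclidean distance between h1[i] and h2[j].
    dist = np.linalg.norm(h1[:, None, :] - h2[None, :, :], axis=2)
    # sup over h1 of inf over h2, and the same with the roles reversed.
    return max(dist.min(axis=1).max(), dist.min(axis=0).max())
```

The distance is zero only when the two sets coincide, and it is large whenever one set contains a point far from every point of the other.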

We shall make reference to an implication of a Corollary on page 30 of
Hildenbrand (1974) that characterizes the set of minimizers of an
"approximating" function over an "approximating set."

Lemma A.1: Suppose

(i) $\{\psi_j\}$ is a sequence of continuous functions mapping U into R that
converges uniformly to $\psi_\infty$

and

(ii) $\{h_j\}$ converges to $h_\infty$.

Then $\limsup_j g_j \subset g_\infty$ where $g_j = \{u \in h_j : \psi_j(u) \le \psi_j(u') \text{ for all } u' \in h_j\}$,
and $\lim_j \min_{h_j} \psi_j = \min_{h_\infty} \psi_\infty$.

Proof: To verify that this follows from the Corollary in Hildenbrand, let J
denote the set of positive integers augmented by $+\infty$, and endow J with the
usual metric for a one-point compactification. Then in light of (i), the
sequence $\{\psi_j\}$ in conjunction with $\psi_\infty$ defines a continuous function on J x U;
and in light of (ii), the sequence $\{h_j\}$ in conjunction with $h_\infty$ defines a
continuous compact correspondence mapping J into U. The conclusion of
Lemma A.1 then follows from the Corollary together with part (ii) of
Proposition 1 on page 22 in Hildenbrand. Q.E.D.

Turning to the result in Section III.A, we formally establish
Proposition 3.1:

Proof of Proposition 3.1:

We treat only the consistency of $\{\tilde d_T\}$ because the corresponding argument
for $\{d_T\}$ is very similar. Assumptions 2.1 and 3.1 imply that $\{\Sigma_T \tilde\phi(\alpha)\}$
converges almost surely to $E\tilde\phi(\alpha)$ for each $\alpha \in C$. Since for each T, $\Sigma_T \tilde\phi$ is
concave as is $E\tilde\phi$, Theorem 10.8 of Rockafellar implies that $\{\Sigma_T \tilde\phi\}$ converges
uniformly on any compact set in $R^n$. Further, as argued in Section II, the set
of maximizers of $E\tilde\phi$ is bounded. For a positive number N, define
$C_N = \{\alpha \in C : |\alpha| \le N\}$ and $D_N = \{\alpha \in C : |\alpha| = N\}$. Then $C_N$ and $D_N$ are compact. By
choosing N to be sufficiently large we can ensure that $C_N$ contains all of the
maximizers of $E\tilde\phi$ over the constraint set C and that none of the maximizers
are in $D_N$. Let $\tilde\delta_N$ be the maximized value of $E\tilde\phi$ over $D_N$. Then by choice of N
we have that $\tilde\delta_N < \tilde\delta^2$. Since $\{\Sigma_T \tilde\phi\}$ converges uniformly to $E\tilde\phi$ on $C_N$ almost
surely, for sufficiently large T, the maximizers of $\Sigma_T \tilde\phi$ over $C_N$ are also not
in $D_N$. By the convexity of C and concavity of $\Sigma_T \tilde\phi$, it follows that for
sufficiently large T, the maximizers of $\Sigma_T \tilde\phi$ over $C_N$ coincide with those over
C. Consequently, the almost sure convergence of $\{\tilde d_T\}$ to $\tilde\delta$ follows from the
almost sure uniform convergence of $\{\Sigma_T \tilde\phi\}$ on $C_N$. Q.E.D.

We now turn to the results in Section III.D and investigate the
statistical consistency of sample analog estimators $\{\ell_T\}$ and $\{u_T\}$ for the
arbitrage bounds $\lambda_0$ and $\upsilon_0$. Recall that the arbitrage bounds are
representable as solutions to linear programming problems. Since there is no
natural compact set for the choice variables in these problems, we must
explore "directions to infinity." We study these "directions" using a
compactification of the parameter space.

First consider any $\alpha \in C$ such that $\alpha'x \ge -1$ with probability one. Then
with probability one $\alpha'x_t \ge -1$ for all t and $\{\alpha'\Sigma_T q\}$
converges almost surely to $\alpha'Eq$. Define $\ell_T^* \equiv -\ell_T$ and $\lambda_0^* \equiv -\lambda_0$. Since
$\ell_T^* \le \alpha'\Sigma_T q$, it follows that $\limsup \ell_T^* \le \lambda_0^*$ with probability one.
Similarly, if $\upsilon_0 < \infty$, $\limsup u_T \le \upsilon_0$. Hence our interest is in the lim inf
$\ell_T^*$ and lim inf $u_T$.

To construct a compact parameter space, we map the original parameter
space for each problem into the closed unit ball in $R^n$ which we denote as U.
We consider explicitly the case of $u_T$. The proofs for the case of $\ell_T$ are
completely analogous to the case for $u_T$ and are omitted.

Notice that the constraint set used in defining $\upsilon_0$ can be represented as
the set of all $\alpha \in C$ satisfying the equation:


(A.2)     $E[(1 - \alpha'x)^+] = 0$ .

Consider now a transformation of the parameter space by mapping the parameter
space into the unit ball. The mapping $\zeta = \alpha/(1 + |\alpha|)$ maps $R^n$ into the open
unit ball. To compactify the transformed parameter space, we consider adding
the boundary points of the unit ball. Notice that we can recover the
original parameterization by considering the inverse mapping:

(A.3)     $\alpha = \zeta/(1 - |\zeta|)$

for $|\zeta| < 1$. Using the transformation in (A.3), instead of considering those
$\alpha$'s that satisfy (A.2) we consider:

(A.4)     $D^* \equiv \{ \zeta \in U \cap C \mid E\{[(1 - |\zeta|) - x'\zeta]^+\} = 0 \}$ .
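
The change of variables and its inverse in (A.3) can be checked numerically with the short Python sketch below. The function names `to_ball` and `from_ball` are hypothetical; points on the boundary of the unit ball have no preimage and correspond to the "directions to infinity" discussed in the text.

```python
import numpy as np

def to_ball(alpha):
    """Map alpha in R^n to zeta = alpha / (1 + |alpha|) in the open unit ball."""
    alpha = np.asarray(alpha, dtype=float)
    return alpha / (1.0 + np.linalg.norm(alpha))

def from_ball(zeta):
    """Inverse mapping (A.3): alpha = zeta / (1 - |zeta|) for |zeta| < 1."""
    zeta = np.asarray(zeta, dtype=float)
    r = np.linalg.norm(zeta)
    if r >= 1.0:
        raise ValueError("boundary points correspond to directions to infinity")
    return zeta / (1.0 - r)
```

For instance, a portfolio with norm 5 maps to a point with norm 5/6, and applying the inverse recovers the original portfolio.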

This transformation potentially adds solutions to (A.2) by including the
boundary of the unit ball. The potentially problematic values of $\zeta$ are those
for which $x'\zeta \ge 0$, $\zeta \in C$ and $|\zeta| = 1$. We rule this out by limiting attention
to values of $\zeta$ in


(A.5)     $D \equiv \{ \zeta \in U \cap C \mid \zeta'Eq \le (1 - |\zeta|)(\upsilon_0 + 1) \}$ .


Notice that any $\zeta$ in D for which $|\zeta| < 1$ satisfies $\zeta'Eq/(1 - |\zeta|) \le (\upsilon_0 + 1)$. In
effect by focusing on $\zeta$'s in D we are eliminating $\zeta$'s corresponding to
payoffs with "high" prices. This does not cause us problems because we are
concerned with estimated upper arbitrage bounds that are too low, not too
high. Also, any $\zeta$ in D for which $|\zeta| = 1$ must have an (average) price that
is nonpositive. This eliminates the troublesome points (directions) from $D^*$.
Let $D_T^*$ be the sample analog of $D^*$ and $D_T$ be the sample analog of D. We first
consider the limiting behavior of $D_T^* \cap D_T$:


Lemma A.2: Suppose that $\upsilon_0 < \infty$. Then $\limsup D_T^* \cap D_T \subset D^* \cap D$.


Proof: First notice that since $\Sigma_T q$ converges to Eq almost surely, then
$\eta(D_T, D)$ converges almost surely to 0. We next establish that $\lim D_T^* = D^*$. To
do this we first show that $\Sigma_T[(1 - |\zeta|) - x'\zeta]^+$ converges uniformly to
$E[(1 - |\zeta|) - x'\zeta]^+$ on $U \cap C$. Note that $U \cap C$ is compact and that:


(A.6)     $E\left| [(1 - |\zeta_1|) - x'\zeta_1]^+ - [(1 - |\zeta_2|) - x'\zeta_2]^+ \right| \le (1 + (E|x|^2)^{1/2})\,|\zeta_1 - \zeta_2|$ .


This is sufficient for the Uniform Law of Large Numbers of Hansen (1982) to
apply. Hence from Lemma A.1, the lim sup of the sequence of minimizers of
$\Sigma_T[(1 - |\zeta|) - x'\zeta]^+$ over $U \cap C$ is contained in the set of minimizers of
$E[(1 - |\zeta|) - x'\zeta]^+$. Since $\upsilon_0 < \infty$, the set $D^*$ is not empty and $D^*$ is the set
of minimizers of $E[(1 - |\zeta|) - x'\zeta]^+$. With probability one, any point in $D^*$
must also be in $D_T^*$ for all $T \ge 1$. Since $D^*$ is separable, a common probability
measure one set of sample points can be selected so that $D^* \subset D_T^*$ for all $T \ge 1$.
As a result $\lim D_T^* = D^*$. The conclusion follows. Q.E.D.


Lemma A.3: Suppose that $\upsilon_0 < \infty$. Then $\liminf u_T \ge \upsilon_0$.


Proof: First note that

(A.7)     $u_T = \min\{ \zeta'\Sigma_T q/(1 - |\zeta|) \mid \zeta \in D_T^* \cap D_T \}$ for sufficiently large T,

and

$\upsilon_0 = \min\{ \zeta'Eq/(1 - |\zeta|) \mid \zeta \in D^* \cap D \}$ .

Hypothetical expansions of the constraint set $D_T^* \cap D_T$ for $u_T$ can only result
in smaller values of the minimized criterion. For instance, suppose the
constraint set is augmented to include all of the points in $D^* \cap D$. Then
Lemma A.2 implies that this sequence of augmented constraint sets converges
to $D^* \cap D$. The conclusion then follows from Lemma A.1. Q.E.D.


Finally, we consider the case in which $\upsilon_0 = \infty$.


Lemma A.4: Suppose that $\upsilon_0 = \infty$. Then $\{u_T\}$ diverges with probability one.


Proof: Since $\upsilon_0 = \infty$, there are no values of $\alpha \in C$ such that $\alpha'x \ge 1$ with
probability one. Consequently, the only values of $\zeta$ in $D^*$ are ones for which
$|\zeta| = 1$. We consider two cases. First suppose that $D^* = \emptyset$. The uniform
convergence of $\Sigma_T[(1 - |\zeta|) - x'\zeta]^+$ to $E[(1 - |\zeta|) - x'\zeta]^+$ implies that for
sufficiently large T, $D_T^* = \emptyset$ and $u_T = \infty$. Next suppose that $D^* \ne \emptyset$. Since
there are no arbitrage opportunities (Assumption 2.2), $\zeta'Eq > 0$ for any $\zeta$ in
$D^*$ such that $\|\zeta'x\| > 0$. Also, Assumption 2.2 together with the no-redundancy
Assumption 2.3 imply that $\zeta'Eq > 0$ for any $\zeta$ in $D^*$ such that $\|\zeta'x\| = 0$.
Furthermore, $D^*$ is closed implying that

(A.8)     $c \equiv \inf\{ \zeta'Eq : \zeta \in D^* \} > 0$ .

Since $\{\Sigma_T q\}$ converges to Eq almost surely and $D_T^*$ converges almost surely to
$D^*$, it follows from Lemma A.1 that with probability one for sufficiently
large T, $\zeta'\Sigma_T q > c/2$ for all $\zeta \in D_T^*$. The convergence of $\{D_T^*\}$ to $D^*$ coupled
with the fact that all elements of $D^*$ have norm one then implies that $\{u_T\}$
diverges almost surely. Q.E.D.

Taken together, Lemmas A.2, A.3 and A.4 imply Proposition 3.4.

Appendix  B:   Asymptotic  Distribution  of  Bounds  Estimators 

In  this  appendix  we  show  that  in  the  case  in  which  the  prices  of  the 
payoffs  are  constant,  the  asymptotic  distribution  of  the  estimated  bounds  can 
be  demonstrated  even  when  the  parameter  vector  is  not  uniquely  identified 
(even when Assumption 2.4 is not satisfied).

Proof of Proposition 3.2:

We consider the case of $\tilde d_T$. The case of $d_T$ is similar. Let $h_T$ be the
set of maximizers of $\Sigma_T \tilde\phi$ and let $h_\infty$ be the set of maximizers of $E\tilde\phi$. For each
T, let $a_T$ be a measurable selection from $h_T$ (see Theorem 1 of Hildenbrand
(1974), page 54). Since $\limsup h_T = h_\infty$ almost surely and $h_\infty$ is compact,
there is a sequence $\{\alpha_T\}$ in $h_\infty$ such that $\lim |a_T - \alpha_T| = 0$ almost surely (see
Appendix A). Further, an implication of Lemma A.1 of Hansen and Jagannathan
(1991) is that all $\alpha \in h_\infty$ result in the same random variable $\tilde m = (y - \alpha'x)^+$.
Also (2.12) implies that for $\alpha \in h_\infty$, $\alpha'q = E\{y(y - \alpha'x)^+ - [(y - \alpha'x)^+]^2\}$, so that
$\alpha'q$ is the same for all $\alpha \in h_\infty$. As a result the random variable $\tilde\phi(\alpha)$ is the
same for all $\alpha \in h_\infty$. Now consider the decomposition of $\sqrt{T}[(\tilde d_T)^2 - \tilde\delta^2]$ as in
(3.7):


(B.1)    $\sqrt{T}[(\tilde d_T)^2 - \tilde\delta^2] = \sqrt{T}\,\Sigma_T[\tilde\phi(a_T) - \tilde\phi(\alpha_T)] + \sqrt{T}[\Sigma_T \tilde\phi(\alpha_T) - E\tilde\phi(\alpha_T)]$ .


As in relation (3.10), we have:

(B.2)    $0 \le \sqrt{T}\,\Sigma_T[\tilde\phi(a_T) - \tilde\phi(\alpha_T)] \le 2\sqrt{T}\,\Sigma_T[(\tilde m x - q) - E(\tilde m x - q)] \cdot (a_T - \alpha_T)$ .


Since $|a_T - \alpha_T|$ converges almost surely to 0, the result follows. Q.E.D.


Appendix  C:   Asymptotic  Distribution 

In  this  appendix  we  consider  the  asymptotic  distribution  of  our 
parameter  estimator.  We  begin  by  demonstrating  that  restrictions  used  in 
Hansen (1982) can be extended along the lines of Pollard (1985) and Pakes and
Pollard (1989) to accommodate "kinks" in the functions used to represent the
moment  conditions.  Then  from  the  discussion  in  Section  III.C, 
it  is  straightforward  to  show  that  Proposition  3.3  follows  from  the  main 
result  in  this  appendix. 

The notation used in this appendix conflicts with some of the notation
used elsewhere in the paper. We let $\beta_0$ denote the parameter vector of
interest and $\beta$ any hypothetical point in the underlying parameter space T.
The parameter space is restricted to satisfy:

Assumption C.1: T contains an open ball in $R^k$ about $\beta_0$.

We will use the construct of a random function. A random function $\psi_t$ maps the
set of sample points into the space of vector-valued continuous functions on
T. We require that $\psi_t(\beta)$ be an n-dimensional random vector for each $\beta$ in T.
We also consider an approximating function


$\psi_t^\ell(\beta) = \psi_t(\beta_0) + A_t (\beta - \beta_0)$


that is linear in $\beta$. The composite random function satisfies:


Assumption C.2: $\{(\psi_t', \psi_t^{\ell\prime})'\}$ is stationary and ergodic and has finite first
moments.


We now specify the sense in which $\psi_t^\ell$ is required to approximate $\psi_t$.
The approximation error induced by using $\psi_t^\ell$ in place of $\psi_t$ is


$r_t(\beta) = |\psi_t(\beta) - \psi_t^\ell(\beta)|$ .


Define:


$\mathrm{dmod}_t(\delta) = \sup\{ r_t(\beta)/|\beta - \beta_0| : |\beta - \beta_0| < \delta, \ \beta \ne \beta_0 \}$ .


Note that $\mathrm{dmod}_t(\cdot)$ is monotone in $\delta$. Therefore, we can take almost sure
limits as $\delta$ declines to zero. We impose the following restrictions on $\mathrm{dmod}_t$.


Assumption C.3: $\lim_{\delta \downarrow 0} \mathrm{dmod}_t(\delta) = 0$ almost surely.


Assumption C.4: $E[\mathrm{dmod}_t(\delta)] < \infty$ for some $\delta > 0$.


To satisfy Assumptions C.3 and C.4, $A_t$ is typically taken to be the matrix of
partial derivatives of $\psi_t$ at $\beta_0$ when $\psi_t$ is differentiable at $\beta_0$ and be well
behaved for the other sample points. The random variable $\mathrm{dmod}_t(\delta)$ is
interpreted as the modulus of differentiability for $\psi_t$ at $\beta_0$.

The approach adopted in Hansen (1982) is to restrict the modulus of
continuity of the derivative of $\psi_t$ to converge almost surely to zero and to
have a finite expectation for some neighborhood of the parameter. It follows
from the Mean-Value Theorem that restrictions imposed in Hansen (1982) on the
local behavior of $\psi_t$ imply Assumptions C.3 and C.4.

We use Assumptions C.3 - C.4 to study the sense in which $\Sigma_T \psi$ is
stochastically differentiable. Hence we look at the approximation error


$\epsilon_T(\delta) = \sup\{ |\Sigma_T \psi(\beta) - \Sigma_T \psi^\ell(\beta)| / |\beta - \beta_0| : |\beta - \beta_0| < \delta, \ \beta \ne \beta_0 \}$ .


By the Triangle Inequality we have that

$\epsilon_T(\delta) \le \Sigma_T \mathrm{dmod}(\delta)$ .


Thus by Assumptions C.1-C.2, we have that


(C.1)     $\lim_{\delta \downarrow 0} \limsup_{T \to \infty} \epsilon_T(\delta) \le \lim_{\delta \downarrow 0} E\,\mathrm{dmod}_t(\delta) = 0$.


This in turn implies the stochastic differentiability condition in Pollard
(1985) because the counterpart to $\epsilon_T(\delta)$ in Pollard's condition is scaled by
$\sqrt{T}|\beta - \beta_0| / (1 + \sqrt{T}|\beta - \beta_0|)$, which is less than one. Also, the iterated limit
in (C.1) implies the limit taken in Pollard's condition because $\epsilon_T$ is
monotone in $\delta$. The differentiability of the limiting moment function $E\psi$ follows
directly from Assumption C.4. Therefore, $\Sigma_T \psi - E\psi$ satisfies the stochastic
differentiability condition with derivative at $\beta_0$ given by $\Sigma_T A - EA$. Since
$\{\psi_t^\ell\}$ is stationary and ergodic, $\{\Sigma_T A - EA\}$ converges almost surely to zero;
hence the derivative is asymptotically negligible.

Next we impose a global identification condition on the approximating
function $\psi_t^a$. Since the approximation of $\psi_t$ by $\psi_t^a$ is local, this condition
can also be viewed as a local identification condition on the original
function $\psi_t$.


Assumption C.5: $E|\Delta_t| < \infty$ and $E\Delta_t$ has full rank $k$.


This rank condition on the derivative together with the stochastic
differentiability conditions already established imply the equicontinuity
condition (iii) in Theorem 3.3 of Pakes and Pollard (1989) (see the
discussion on page 1043 of Pakes and Pollard).

We study the behavior of an estimator $b_T$ that solves the equations:


$$a_T \, \frac{1}{T}\sum_{t=1}^{T} \psi_t(b_T) = 0$$


for sufficiently large $T$. The ($k \times n$) random matrix $a_T$ selects the linear
combination of moment conditions to be used in estimation.
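
A minimal sketch of an estimator that solves the selected moment equations, assuming (for illustration only) a scalar parameter and the linear moment function psi_t(beta) = z_t (y_t - x_t beta) with n = 2 instruments. Because this function is linear in beta, the equation has a closed-form solution. The simulated data, the selection weights and the name `solve_selected_moments` are hypothetical and do not come from the paper.

```python
import random


def solve_selected_moments(a, z, x, y):
    """Solve a * (1/T) sum_t z_t (y_t - x_t b) = 0 for a scalar b.

    a: list of n selection weights (a 1 x n selection matrix);
    z: list of T instrument vectors, each of length n;
    x, y: lists of T scalars.
    """
    T, n = len(y), len(a)
    zy = [sum(z[t][i] * y[t] for t in range(T)) / T for i in range(n)]
    zx = [sum(z[t][i] * x[t] for t in range(T)) / T for i in range(n)]
    num = sum(a[i] * zy[i] for i in range(n))
    den = sum(a[i] * zx[i] for i in range(n))
    if den == 0.0:
        raise ValueError("selected derivative is singular; b is not identified")
    return num / den


# Simulated data (all values hypothetical): y_t = beta0 * x_t + u_t.
random.seed(1)
beta0, T = 0.5, 5000
z = [[1.0, random.gauss(0.0, 1.0)] for _ in range(T)]
x = [1.0 + z_t[1] + random.gauss(0.0, 1.0) for z_t in z]
y = [beta0 * x_t + random.gauss(0.0, 1.0) for x_t in x]

a = [1.0, 1.0]
b_T = solve_selected_moments(a, z, x, y)

# Check that the selected sample moment is (numerically) zero at b_T.
residual = sum(
    (a[0] * z_t[0] + a[1] * z_t[1]) * (y_t - x_t * b_T)
    for z_t, x_t, y_t in zip(z, x, y)
) / T
print(b_T, residual)
```

With n = 2 moment conditions and k = 1 parameter, different choices of the weights `a` generally give different estimates. The guard on `den` corresponds to the requirement that the selected derivative be nonsingular.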


Assumption C.6: $\{b_T\}$ converges in probability to $\beta_0$.


Assumption C.7: $\{a_T\}$ converges in probability to a nonrandom matrix $a_0$
where $a_0 E\Delta_t$ is nonsingular.


Finally, to obtain a limiting distribution for $\{b_T\}$ we assume:


Assumption C.8: $\{\sqrt{T} \, \frac{1}{T}\sum_{t=1}^{T} \psi_t(\beta_0)\}$ converges in distribution to a normally
distributed random vector with mean zero and nonsingular covariance matrix
$V_0$.

Sufficient  conditions  for  Assumption  C.8  can  be  obtained  using  martingale 
approximations  as  described  by  Gordin  (1969),  Hall  and  Heyde  (1980)  and 
Hansen (1985). This condition implies that $E\psi_t(\beta_0)$ is equal to zero.

The following extension of Theorem 3.1 in Hansen (1982) is now a direct
consequence of Theorem 3.3 and Lemma 3.5 in Pakes and Pollard (1989).

Theorem C.1: Suppose that Assumptions C.1-C.8 are satisfied. Then
$\{\sqrt{T}(b_T - \beta_0)\}$ converges in distribution to a normally distributed random vector
with mean zero and covariance matrix $[a_0 E(\Delta_t)]^{-1} a_0 V_0 a_0' [E(\Delta_t') a_0']^{-1}$.


Estimation of $E\Delta_t$ follows as in Hansen (1982) as long as $\Delta_t$ can be expressed in
terms of a random matrix function $D_t$ that satisfies $\Delta_t = D_t(\beta_0)$, where $D_t$ is
continuous at $\beta_0$ with probability one and has a modulus of continuity with a
finite first moment for some $\delta > 0$. In this case, $\{\frac{1}{T}\sum_{t=1}^{T} D_t(b_T)\}$ converges in
probability to $E\Delta_t$.
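
The covariance formula in Theorem C.1 and the sample-average estimator of the expected derivative can be sketched for the scalar-parameter case (k = 1), where the matrix inverses reduce to division. The sketch assumes a hypothetical linear moment function psi_t(beta) = z_t (y_t - x_t beta), so that D_t(b) = -z_t x_t. It estimates V_0 by the sample second-moment matrix of the moment conditions, which is appropriate only when they are serially uncorrelated; more generally V_0 is a long-run covariance matrix. All names and simulated values are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sandwich_variance(a, d, V):
    """Scalar-parameter version of [a d]^{-1} a V a' [d' a']^{-1}.

    a: selection weights (length n); d: estimate of E[Delta_t] (length n);
    V: n x n covariance matrix of the scaled moment conditions.
    """
    n = len(a)
    ad = sum(a[i] * d[i] for i in range(n))
    if ad == 0.0:
        raise ValueError("a * E[Delta] must be nonsingular")
    aVa = sum(a[i] * V[i][j] * a[j] for i in range(n) for j in range(n))
    return aVa / (ad * ad)


# Hypothetical simulated data: y_t = beta0 * x_t + u_t with instruments z_t.
random.seed(2)
beta0, T, n = 0.5, 5000, 2
z = [[1.0, random.gauss(0.0, 1.0)] for _ in range(T)]
x = [1.0 + z_t[1] + random.gauss(0.0, 1.0) for z_t in z]
y = [beta0 * x_t + random.gauss(0.0, 1.0) for x_t in x]
a = [1.0, 1.0]  # selection weights, taken as nonrandom in this sketch

# Estimator solving a * (1/T) sum_t z_t (y_t - x_t b) = 0.
w = [a[0] * z_t[0] + a[1] * z_t[1] for z_t in z]
b_T = sum(w_t * y_t for w_t, y_t in zip(w, y)) / sum(
    w_t * x_t for w_t, x_t in zip(w, x)
)

# Sample average of D_t(b_T) = -z_t x_t (constant in b for this function).
d_hat = [sum(-z[t][i] * x[t] for t in range(T)) / T for i in range(n)]

# Moment conditions at b_T and their sample second-moment matrix.
psi = [[z[t][i] * (y[t] - x[t] * b_T) for i in range(n)] for t in range(T)]
V_hat = [[sum(p[i] * p[j] for p in psi) / T for j in range(n)] for i in range(n)]

avar = sandwich_variance(a, d_hat, V_hat)
std_err = math.sqrt(avar / T)
print(b_T, avar, std_err)
```

Under the simulation design assumed here the population value of the asymptotic variance is 0.5, so `avar` should be close to that number in large samples. For k > 1 the same formula would be evaluated with matrix products and inverses.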

Footnotes

1
A weaker version of this restriction would replace Eq by q. In effect,
Assumption  2.3  does  more  than  eliminate  redundant  securities.  It  also 
precludes  cases  in  which  distinct  portfolio  weights  give  rise  to  the  same 
payoff,  possibly  different  prices  but  the  same  expected  prices. 


2
Formally, the pricing-error interpretation for least squares problem (2.6) is

$$\delta = \inf_{m \in M} \; \sup_{p \in P, \; Ep^2 = 1} |Emp - Eyp| ,$$

and for (2.7) is

$$\delta = \inf_{m \in M^*} \; \sup_{p \in H, \; Ep^2 = 1} |Emp - Eyp|$$

where $H$ is a complete set of derivative claims on the payoffs in $P$.

3
Haberman characterized this nonlinear function as a particular projection
onto a closed convex set formed by translating C by -p. Although Haberman
(1989) only considers the case in which the data are iid, his
characterization of the limiting distribution applies more generally with a
covariance matrix replaced by a spectral density matrix at frequency zero.

4
The impetus for this work was the econometric discussion in an unpublished
precursor to this paper: Hansen and Jagannathan (1988).


5
The Hausdorff metric is usually employed for compact sets to ensure that
the  resulting  distance  is  finite.  Because  of  the  vertical  character  of  the 
regions  and  the  existence  of  finite  arbitrage  bounds,  the  Hausdorff  distance 
will  be  finite  even  though  the  sets  are  not  bounded. 


6
The Euclidean distance in (4.2) could be replaced by the square root of a
quadratic  form  in  the  differences  between  two  points  as  long  as  a 
positive  weight  is  given  to  both  dimensions. 


7
Even if hypothesis (4.9) is satisfied, the sample analog may be infinite,
making implementation problematic. This happens when the sample mean is
outside the estimated arbitrage bounds. This phenomenon does not arise for
hypothesis (4.8).

8
Burnside (1992) and Cecchetti, Lam and Mark (1992) developed and studied
alternative versions of the volatility bounds tests when no transactions
costs are introduced. The test used by Cochrane and Hansen (1992) abstracted
from positivity and can be formulated equivalently using $\phi$ in (4.6). See
Burnside (1992) for a Monte Carlo comparison of various volatility tests
including the ones proposed here.